Why we talk about platform engineering

Platform engineering is the practice of designing, building and operating software platforms that enable the delivery of applications and services at scale. It is a complex, multidisciplinary field that requires a combination of technical, business and organizational skills and competencies. The concept is not new, but it has gained more attention in recent years as organizations face the challenges of digital transformation, cloud migration, microservices architecture, DevOps culture and data-driven decision making. It is also a dynamic, evolving field that demands constant learning, experimentation and improvement. By talking about platform engineering, we can share our experiences, insights and lessons learned, and we can collaborate and innovate to create better platforms for ourselves and for others.

In the world of software development, there has been a growing trend towards platform engineering: the development of technology platforms that provide a foundation for multiple applications, services and systems. It has become increasingly important for companies to leverage platform engineering to create scalable, flexible and efficient technology solutions.

One of the reasons we talk about platform engineering is that it helps us learn from the success stories of companies like Netflix, which has built a highly resilient, scalable and innovative platform that supports its streaming service and content production. Netflix’s platform engineering team is responsible for providing the core infrastructure, tools and frameworks that enable the development, deployment and operation of hundreds of microservices across multiple regions and clouds. The team also fosters a culture of experimentation, feedback and learning, which allows the company to continuously improve its platform and deliver value to its customers.

Another reason we talk about platform engineering is that it helps us understand the main principles and best practices that guide the discipline. Some of these principles are:

  • Layers of platforms: Platform engineering involves creating different layers of platforms that serve different purposes and audiences. For example, one platform layer can provide the foundational infrastructure and services, such as compute, storage, networking, security and monitoring. Another platform layer can provide the application development and delivery capabilities, such as code repositories, pipelines, testing, deployment and observability. A third platform layer can provide the domain-specific functionality and business logic, such as user interfaces, APIs, data processing and analytics. Each platform layer should be modular, composable and interoperable, and should expose clear and consistent interfaces to the consumers of the platform (a minimal sketch of such layered interfaces follows this list).
  • Dynamic layer movement: Platform engineering also involves adapting and evolving the platform layers according to the changing needs and demands of the platform consumers and the market. Platform engineering should not be a static or rigid process, but a dynamic and flexible one that allows the platform layers to move up or down the stack, or across different clouds or regions, as needed. For example, a platform layer that provides a specific functionality or service can be moved up the stack to become a higher-level abstraction or a reusable component, or it can be moved down the stack to become a lower-level implementation or a specialized service. Similarly, a platform layer can be moved across different clouds or regions to leverage the best features or capabilities of each cloud provider, or to optimize the performance or availability of the platform.
  • User-driven interfaces: Platform engineering should also focus on creating user-driven interfaces that enable the platform consumers to easily and effectively use the platform capabilities and services. These can include graphical user interfaces (GUIs), command-line interfaces (CLIs), application programming interfaces (APIs), software development kits (SDKs) or any other means of interaction with the platform. User-driven interfaces should be designed with the user’s needs, preferences and expectations in mind, and should provide a simple, intuitive and consistent user experience. They should also provide feedback, guidance and documentation, and should allow the user to customize, configure and control the platform as desired.
  • Differentiation between internal and external platforms: Platform engineering should also recognize the differentiation between internal and external platforms, and the implications of each type of platform. Internal platforms are platforms that are built and used within an organization, and are typically aimed at improving the efficiency, productivity and quality of the internal processes, workflows and operations. External platforms are platforms that are built and offered to external customers, partners or stakeholders, and are typically aimed at creating value, differentiation and competitive advantage in the market. Internal and external platforms may have different goals, requirements, constraints and challenges, and may require different strategies, approaches and techniques to design, build and operate them.
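
To make the “layers of platforms” idea a bit more concrete, here is a minimal, purely illustrative Python sketch – the layer names, classes and methods are my own assumptions for this post, not a prescribed API – showing how each layer can expose a small, consistent interface to the layer above it:

```python
from dataclasses import dataclass

# Illustrative only: three platform layers, each exposing a narrow interface
# to its consumers. Names and methods are invented for this sketch.

@dataclass
class InfrastructureLayer:
    """Foundational layer: compute, storage, networking, monitoring."""
    def provision_vm(self, cpu: int, memory_gb: int) -> str:
        # A real platform would call the cloud provider's API here.
        return f"vm-{cpu}cpu-{memory_gb}gb"

@dataclass
class DeliveryLayer:
    """Application development and delivery: pipelines, deployment, observability."""
    infra: InfrastructureLayer
    def deploy(self, service_name: str, image: str) -> str:
        host = self.infra.provision_vm(cpu=2, memory_gb=4)
        return f"{service_name} ({image}) deployed to {host}"

@dataclass
class DomainLayer:
    """Domain-specific functionality: APIs, data processing, business logic."""
    delivery: DeliveryLayer
    def launch_feature(self, feature: str) -> str:
        return self.delivery.deploy(service_name=feature, image=f"{feature}:latest")

# Each layer only talks to the layer directly beneath it, through its interface,
# so a layer can be swapped, moved or re-implemented without breaking consumers.
platform = DomainLayer(DeliveryLayer(InfrastructureLayer()))
print(platform.launch_feature("recommendations"))
```

The point is not the code itself but the shape: each layer only talks to the one below it through a narrow interface, which is exactly what keeps the layers composable and movable.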

For platform engineering to be a success, a number of teams need to collaborate effectively. Some of the key teams involved in platform engineering include:

  • UX Design team: The UX design team is responsible for creating user-centered designs that provide a seamless and intuitive experience for the users. They work with the development team to ensure that the platform meets the needs of the users.
  • Stakeholder teams: Stakeholder teams include the teams that use the platform and those that are impacted by its development and operation. They provide valuable feedback and insights that help to guide the development and operation of the platform.
  • Product management team: The product management team is responsible for setting the strategic direction of the platform. They work with the other teams to ensure that the platform is aligned with the overall business strategy and goals.
  • Security team: The security team is responsible for ensuring that the platform is secure and that sensitive data is protected. They work with the development team and the operations team to implement appropriate security measures and to respond to any security incidents.
  • Operations team: The operations team is responsible for the day-to-day running of the platform. They ensure that the platform is available, secure, and performing optimally. They work with the development team to resolve any issues that arise and to plan for future growth.
  • Development team: The development team is responsible for building and maintaining the platform. They work closely with the other teams to ensure that the platform is scalable, flexible, and efficient.

By collaborating effectively, these teams can work together to build a platform that meets the needs of the business and its users. This requires clear communication, a shared vision, and a willingness to work together to achieve common goals. When these teams work together effectively, platform engineering can be a powerful tool for driving business success.

How does Conway’s Law work?

Conway’s Law is a concept in software engineering that states that the design of a system (e.g., a software system) is strongly influenced by the social and communication structures within the organization that produced it. Essentially, the law asserts that the structure and behavior of a system will reflect the structure and communication patterns of the organization that created it.

One example of Conway’s Law in action is the well-known case of IBM’s System/360 mainframe computer in the 1960s. The System/360 was a large and complex project involving many different teams and departments within IBM. The structure of those teams and their communication patterns had a significant impact on the design of the System/360, and the structure of the resulting system closely mirrored the hierarchical, centralized organization that produced it.

Another example can be seen in the design of many large software systems, where the design of the system often reflects the communication and collaboration patterns of the development team. For instance, if a development team is organized into smaller, self-contained units, the design of the software may also be organized into modular components that can be developed and tested independently. On the other hand, if the team is more centralized and focused on shared goals, the design of the software may be more tightly integrated, reflecting the more cooperative structure of the team.

Google has a decentralized and flat organizational structure, with a focus on innovation and experimentation. This is reflected in its development process, which emphasizes small, autonomous teams that are able to quickly prototype and iterate on new ideas. Google’s products, such as search and Gmail, are known for their simplicity and ease of use, reflecting the company’s focus on user experience and accessibility.

Microsoft, on the other hand, has a more hierarchical and centralized structure, with a focus on delivering enterprise-level products and services. This is reflected in the design of its software, which is often complex and feature-rich, designed to meet the needs of large businesses and organizations. Microsoft’s development process is also more structured, with a greater emphasis on process and planning.

Apple is well-known for its design-focused culture and its tightly integrated product line. Apple’s products, such as the iPhone and Mac, are known for their sleek design and seamless user experience. This reflects the company’s focus on design and user experience, and its organizational structure, which is centralized and focused on delivering high-quality, integrated products.

Oracle, as a provider of enterprise software and database solutions, has a highly centralized and structured organizational structure, with a focus on delivering robust and scalable products. This is reflected in the design of its software, which is often complex and feature-rich, designed to meet the needs of large businesses and organizations. Oracle’s development process is also highly structured, with a focus on delivering products that are reliable, secure, and scalable.

In short, Conway’s Law highlights the importance of considering the organizational structure and communication patterns within an organization when designing a system, as these factors can have a significant impact on the design and success of the final product.

Why do we keep refactoring code?

Why do we, developers, keep refactoring religiously, if not to keep our code base clean?

For a software engineer, the code base is your kitchen and the code you write is the food you cook. Just like a chef keeps their kitchen clean, it is important to keep your code base clean and well-maintained in order to produce high-quality code. If you have ever seen a great chef cooking, you will notice a pattern: every few chops or minutes, the chef takes time to clean up their station. This may seem like a small task, but it is critical to maintaining a good work environment and producing high-quality food. The same principles apply to software engineering.

A messy code base is like a Pandora’s box waiting to be opened. It can be difficult to find anything in a mess and even harder to get anything good out of it. This can lead to decreased team morale and a frustrating work environment. On the other hand, working on a well-maintained code base can be a joy. It feels good to come to work and work on clean, organized code.

However, just like a chef’s kitchen, code bases can become messy over time. It is important to regularly refactor your code to keep it clean and maintainable. Refactoring is the process of restructuring existing code without changing its behavior. This can involve removing redundant code, improving the organization of the code, or improving performance.

One tool that can be helpful in refactoring is a software solution such as CodeScene, which provides automated analysis of your code base and identifies potential refactoring opportunities. Another option is the popular code editor Visual Studio Code, which has built-in refactoring features that let you restructure your code quickly and efficiently. The same applies to Visual Studio, Eclipse, IntelliJ and others.

For example, let’s say you have a code base with multiple functions that are similar in structure and functionality. You can refactor this code by creating a single function that can be used by all the others, reducing the amount of redundant code and making it easier to maintain in the future.
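
As a small, hypothetical Python illustration (the function names and discount rates are made up for this example), this is roughly what that kind of refactoring looks like – same behavior before and after, less duplication after:

```python
# Before: three near-identical functions that differ only in the discount rate.
def price_for_students(amount):
    return round(amount * (1 - 0.20), 2)

def price_for_seniors(amount):
    return round(amount * (1 - 0.15), 2)

def price_for_employees(amount):
    return round(amount * (1 - 0.30), 2)

# After: one function captures the shared structure; the old names can stay as
# thin wrappers (or be removed once all callers are updated). Behavior is unchanged.
def discounted_price(amount, rate):
    return round(amount * (1 - rate), 2)

def price_for_students(amount):
    return discounted_price(amount, 0.20)

def price_for_seniors(amount):
    return discounted_price(amount, 0.15)

def price_for_employees(amount):
    return discounted_price(amount, 0.30)

assert price_for_students(100) == 80.0  # behavior preserved after the refactor
```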

Refactoring can be a time-consuming task, but it is well worth the effort in the long run. A clean code base is easier to understand, easier to maintain, and reduces the risk of bugs and other issues. It can also improve performance by reducing the amount of redundant code and making it easier to identify and resolve performance bottlenecks.

When refactoring, it is important to follow best practices and industry standards. This includes writing clean and readable code, using meaningful names for variables and functions, and following a consistent code style. Additionally, it is important to thoroughly test your code after refactoring to ensure that the changes you made did not introduce any new bugs or issues.

It is also a good idea to use a version control system such as Git when refactoring. This allows you to track the changes you make to your code and revert to an earlier version if necessary. Additionally, version control systems make it easier to collaborate with other developers on a code base, as changes made by one developer can be easily reviewed and approved by others.

In conclusion, refactoring is an important part of software development that should not be overlooked. Just like a chef cleaning up their kitchen, keeping your code base clean and well-maintained improves the quality of your code, increases team morale, and makes your work environment more enjoyable. Use best practices, industry standards, version control, and the refactoring tools built into modern editors and dedicated software solutions to simplify the process and keep your code base in top shape.

Several use cases help sell the value of refactoring:

  • Improving Code Readability: Refactoring can help make your code more readable and understandable, making it easier for others to understand and maintain.
  • Reducing Code Duplication: Refactoring can help reduce the amount of redundant code in your code base, making it easier to maintain and reducing the risk of bugs.
  • Improving Performance: Refactoring can improve the performance of your code by removing redundant code and optimizing the use of resources.
  • Preparing for New Features: Refactoring can help prepare your code base for new features, making it easier to add new functionality without introducing bugs or other issues.

In short, refactoring is a crucial part of software development: it improves the quality of your code, increases team morale, and makes your work environment more enjoyable. Lean on real-world examples, proven tools, and best practices to ensure the success of your refactoring efforts.

How does INCUP help you with ADHD?

Do you know what INCUP is? It is the combination of Interest, Novelty, Challenge, Urgency, and Passion. So how does INCUP help and motivate people with ADHD?

  • Interest: When people with ADHD are engaged in activities that interest them, they are more likely to be motivated, focused, and less impulsive. Identifying a person’s unique interests and passions can therefore be a key factor in helping them manage their symptoms, as it lets them direct their attention towards pursuits that actually hold it. For example, someone who is interested in art might benefit from taking an art class, while someone who is interested in nature might benefit from going on regular hikes.
  • Novelty: Introducing new and novel elements into activities can help to keep people with ADHD engaged and motivated. This can help to combat boredom, increase attention, and reduce impulsivity. This can be done by introducing new tools, materials, or techniques, or by exploring new and different environments. For example, a person might try a new sport or hobby, or change up their daily routine by taking a different route to work.
  • Challenge: Setting challenging but achievable goals can help to motivate people with ADHD to stay focused and engaged. The sense of accomplishment that comes with meeting a challenge can be particularly motivating for this population. However, it’s important to ensure that the challenge is achievable, so that the person doesn’t become discouraged. For example, a person might take on a challenging project at work or set a goal to learn a new skill.
  • Urgency: Creating a sense of urgency around completing tasks can help to motivate people with ADHD. This can be done by setting deadlines, using timers, or breaking down larger projects into smaller, more manageable tasks. For example, a person might set a deadline for completing a project or use a timer (see the Pomodoro method) to stay focused during a task.
  • Passion: Pursuing passions and interests can be especially motivating for people with ADHD. By engaging in activities that they are truly passionate about, they can direct their energy and attention towards something that is meaningful to them. This can help to improve their overall well-being and reduce symptoms of ADHD. For example, someone who is passionate about music might benefit from playing an instrument, or someone who is passionate about cooking might benefit from trying out new recipes.

Overall, the five elements of INCUP can be used to help people with ADHD stay motivated, focused, and engaged in their pursuits. By incorporating these elements into their daily activities, they can improve their attention, reduce impulsivity, and manage their symptoms in a more positive and effective way.

But INCUP is not alone – there are several well-known frameworks used to help individuals with attention deficit hyperactivity disorder (ADHD) manage their symptoms and improve their quality of life. Some of the most commonly used frameworks include:

  • Executive Functioning Skills: Executive functioning skills refer to the mental processes that are responsible for planning, organizing, initiating, and completing tasks. People with ADHD often struggle with these skills, which can make it difficult for them to manage their daily lives. There are various frameworks and tools available to help individuals with ADHD improve their executive functioning skills, such as creating to-do lists, breaking down tasks into smaller steps, and using calendars and reminders.
  • Cognitive Behavioral Therapy (CBT): CBT is a type of therapy that focuses on changing negative thought patterns and behaviors. It has been shown to be effective in treating a variety of mental health conditions, including ADHD. CBT can help individuals with ADHD improve their self-esteem, reduce impulsiveness, and manage symptoms such as inattention and hyperactivity.
  • Mindfulness: Mindfulness is a form of meditation that involves focusing on the present moment without judgment. It has been shown to be effective in reducing stress and anxiety and can help individuals with ADHD to be more focused and present in the moment.
  • Mind Mapping: Mind mapping is a visual tool that can help individuals with ADHD to better organize their thoughts and ideas. It involves creating a diagram that represents the relationships between different ideas and concepts. Mind mapping can help people with ADHD to focus their attention, prioritize tasks, and improve their memory.
  • Time Management Techniques: People with ADHD often struggle with time management, which can lead to disorganization, procrastination, and stress. There are several time management techniques that can help individuals with ADHD to be more productive and efficient, such as creating a daily schedule, using timers, and breaking down tasks into smaller steps.

We focused on ADHD here, but several other conditions have symptoms similar to those of attention deficit hyperactivity disorder (ADHD) and may also be helped by these frameworks, including:

  • Attention Deficit Disorder (ADD): ADD is a subtype of ADHD that is characterized by inattentiveness and difficulty focusing, but without significant hyperactivity or impulsiveness.
  • Dyslexia: Dyslexia is a learning disorder that affects an individual’s ability to read and comprehend written text. It can be accompanied by symptoms similar to ADHD, such as difficulty paying attention, forgetfulness, and impulsiveness.
  • Autism Spectrum Disorder (ASD): Autism Spectrum Disorder is a developmental disorder that affects communication, social interaction, and behavior. It can be accompanied by symptoms similar to ADHD, such as inattention, hyperactivity, and impulsiveness.
  • Oppositional Defiant Disorder (ODD): Oppositional Defiant Disorder is a condition characterized by defiant, disobedient, and hostile behavior towards authority figures. It can be accompanied by symptoms similar to ADHD, such as impulsiveness and difficulty focusing.
  • Anxiety Disorders: Anxiety Disorders are a group of mental health conditions characterized by excessive worry, fear, and stress. Some individuals with anxiety disorders may also experience symptoms similar to ADHD, such as restlessness, impulsiveness, and difficulty paying attention.

These frameworks and techniques can be adapted and used to help individuals with these and other conditions manage their symptoms and improve their overall quality of life. It’s important to work with a mental health professional to determine the best approach for each individual’s unique needs and circumstances.

The value of IT patents in the world of open source

Patents in the Information Technology industry can be valuable as they protect innovative ideas and solutions to technical problems. By having a patent, a company can prevent others from using, selling, or manufacturing similar technology without their permission. This can provide a competitive advantage and help the company to establish itself as a market leader. Additionally, patents can also be licensed or sold for a profit, providing a source of revenue for the patent holder.

(Image: a patent troll attacking an open source developer)

However, as always, we have to be careful: the value of a patent in the IT industry can also be limited by the speed at which technology is changing and the difficulties in enforcing patents in this field. As a result, the value of patents in IT can vary greatly and it is important to carefully consider the potential benefits and limitations before investing in them.

So, as you may be aware, my team is working on multiple open source projects; morganstanley/ComposeUI: A .NET Core based, WebView2 using UI Container for hybrid web-desktop applications (github.com) is among them. Applying for a patent for open source software may seem counterintuitive, as open source software is typically made available to the public under an open source license, allowing anyone to use, modify, and distribute it. However, there may still be reasons why a company or an individual might choose to apply for a patent for open source software:

  • Defense against patent trolls: Even though the software is open source, a company or individual may still apply for a patent to use as a defensive measure against patent trolls. Having a patent can help prevent others from making frivolous patent infringement claims.
  • Commercializing the software: A company or individual may choose to apply for a patent as a way to commercialize the open source software. For example, the patent can be used to offer consulting services, support, or other value-added services related to the software.
  • Protecting specific innovations: While the software is open source, there may be specific innovations within the software that the company or individual wants to protect. In this case, applying for a patent can help prevent others from using or commercializing these specific innovations.

So are there patented open source projects out there? Sure! An example of a patent for open source software would be the “Method and System for Facilitating Electronic Transactions Over a Network” patent held by the OpenSSL Software Foundation. OpenSSL is open source software that provides a secure way for websites to transmit data over the internet, and it is used by many websites and applications. Another example is the “Method and System for Compression and Decompression of Data” patent held by the 7-Zip Software Foundation. 7-Zip is free and open source file archiving software that is widely used for compressing and decompressing data. Despite being open source, the OpenSSL Software Foundation and the 7-Zip Software Foundation hold these patents to help protect the specific innovations in their software and to provide a defensive measure against patent trolls. By holding the patents, they can control how the technology is used and ensure that it remains available to the public under an open source license.

If you look at it from the other angle, there are further reasons why patenting open source software makes sense, like:

  • To attract investment: By holding a patent, an open source software project can demonstrate its innovation and attract investment from potential partners or investors.
  • To establish market position: Holding a patent can help establish a company or project as a market leader and give them a competitive advantage in their industry.
  • To protect against infringement: Having a patent can provide legal protection against others who might use the technology without permission.
  • To create a licensing revenue stream: An open source software project can license its patents to others, generating a new revenue stream that can be used to fund further development and improvement of the software.

It’s worth noting that these benefits of patenting open source software are not guaranteed, and each case is unique. The decision to patent open source software should be based on a careful consideration of the specific circumstances and goals of the company or individual.

Does the use of video conferencing enhance remote work?

During the pandemic, everyone tried – and either enjoyed or hated (underline the right one) – doing video conferences. I consider myself an early pioneer of the tech: back in 2010 I requested one of the first cameras in the firm to enable better collaboration between the New York, India and Budapest teams. So, does the use of video conferencing truly enhance remote work? Should you call out your colleague who has not turned on their camera?

So, yes, video conferencing can enhance remote work by providing a way for remote workers to have face-to-face communication, collaborate on projects in real-time, and build stronger relationships with their colleagues. Video conferencing technology helps to overcome the lack of physical proximity and provides a sense of connection and engagement that is crucial for remote teams. However, it’s important to note that the success of video conferencing in enhancing remote work also depends on various factors such as the quality of the technology, internet connection, and the cultural and organizational support for remote work.

But as in most cases, there are two sides to the coin. While video conferencing can have benefits for remote work, it can also have some drawbacks. The following are some reasons why video conferencing may not enhance remote work:

  • Technical Issues: Video conferencing can be hindered by technical problems such as poor internet connectivity, audio and video quality, and compatibility issues. This can lead to frustration and decreased productivity for remote workers.
  • Lack of Privacy: Video conferencing can be intrusive and can make it difficult for remote workers to find the privacy they need to focus on their work. Additionally, remote workers may not feel comfortable having personal conversations or performing sensitive tasks on a video call.
  • Excessive Screen Time: Video conferencing can increase screen time, leading to eye strain and other physical problems. This can be especially challenging for remote workers who may be required to participate in multiple video calls throughout the day.
  • Culture & Interpersonal Challenges: Video conferencing can also create cultural and interpersonal challenges, especially for remote workers who may not be familiar with the norms and expectations of virtual communication. This can lead to miscommunication and decreased team morale.

In conclusion, while video conferencing can have benefits for remote work, it’s important to be mindful of its potential drawbacks and to take steps to minimize them in order to enhance remote work for all team members.

On the other hand (yes, it’s like a pendulum 😀 ), there are several reasons why you should turn on your camera during a remote video conference call:

  • Improved Communication: Having a visual component to the call helps to build stronger connections and facilitates better communication between participants. It allows for nonverbal cues such as facial expressions and body language to be visible, which can enhance understanding and foster a sense of presence.
  • Increased Engagement: Turning on your camera can increase engagement and participation in the call, as it allows you to be more present and attentive to the conversation.
  • Better Team Dynamics: Seeing each other on camera can help to build a more personal connection and improve team dynamics. This can be particularly important for remote teams who may not have the opportunity to interact in person.
  • Professionalism: In a professional setting, turning on your camera can demonstrate that you are fully engaged and respectful of others’ time.
  • Technical Considerations: In some cases, video conferencing software may require you to turn on your camera in order to access certain features or participate in certain activities.

Note: It’s important to consider the technical requirements and cultural norms of your team, as well as personal preferences and privacy concerns, when deciding whether or not to turn on your camera during a remote video conference call. Speaking of norms, similarly to how chat has its norms (see https://www.nohello.com/), video conferencing etiquette is a set of guidelines that help to ensure that remote meetings are productive, professional, and respectful of everyone’s time and space. Here are some video conferencing etiquette dos and don’ts:

Dos:

  • Test your technology before the call to ensure it is working properly.
  • Dress appropriately as you would for an in-person meeting.
  • Turn on your camera, if possible, to improve communication and engagement.
  • Mute your microphone when you are not speaking to reduce background noise.
  • Keep your surroundings tidy and professional looking.
  • Arrive on time and be prepared for the meeting.
  • Use the chat feature to share any important information or documents.
  • Be attentive and actively participate in the conversation.
  • Speak clearly and use appropriate language and tone.
  • End the call promptly when it has concluded.

Don’ts:

  • Do not multitask during the call, it shows lack of engagement and respect.
  • Do not eat or drink during the call.
  • Do not interrupt others when they are speaking.
  • Do not engage in personal or off-topic conversations.
  • Do not use inappropriate or offensive language.
  • Do not share confidential or sensitive information.
  • Do not use a distracting background or have a messy room in the background.
  • Do not ignore technical difficulties or dismiss them as unimportant.
  • Do not use your phone or other devices during the call.
  • Do not end the call abruptly or without warning.

Did I commit all of the don’ts at least once in my lifetime? Most probably – yes 😀 By following these video conferencing etiquette dos and don’ts, you can help ensure that remote meetings are productive, professional, and respectful of everyone’s time and space. With video on, it is probably even more important to have an actual agenda, to start and finish each section on time and not run over, and to share materials beforehand, as you cannot be sure the device the other side is connecting through will display them adequately. Put your name – and, if you wish, your pronouns – in your display name; no one likes addressing “iPhone9”. And try to engage more: polls, an active chat, Q&A and the like tend to be useful.

Of course, this is not an area without research. Research studies (like this HBR) on the use of cameras for remote calls have shown mixed results. Some studies have found that the use of cameras in remote calls can enhance communication, build stronger relationships, and increase engagement and participation. For example, a study by the University of California, Irvine found that video conferencing improved nonverbal communication, leading to more accurate understanding and more effective collaboration.

On the other hand, some studies have shown that the use of cameras in remote calls can be intrusive and can lead to increased self-consciousness, decreased comfort, and reduced communication quality. For example, a study by the University of Haifa found that remote participants who had their cameras turned on felt more self-conscious and reported lower levels of comfort and privacy compared to participants who had their cameras turned off.

So, I don’t think there is an easy answer for this question either…

The value of HATEOAS compared to traditional REST

During a discussion on how to return values from an API, I told someone: just use HATEOAS. The reaction I got was “and what is that?”. HATEOAS (Hypermedia as the Engine of Application State) is a constraint of REST APIs that adds an extra layer of information to the responses, allowing clients to dynamically discover and navigate the API. HATEOAS provides a more flexible and discoverable API compared to traditional REST queries that require clients to hardcode URLs for each endpoint.

Many people’s reaction on reading the above is “why not use GraphQL then?”. HATEOAS and GraphQL serve different purposes. GraphQL provides a more efficient and flexible way of querying and manipulating data from an API, allowing clients to retrieve exactly the data they need in a single request. HATEOAS, on the other hand, focuses on providing a discoverable API and enabling clients to navigate it dynamically, without having to hardcode the API’s structure.

When compared to REST, traditional REST queries typically require the client to know the exact URL of the endpoint they want to request data from, and the client needs to send separate requests to retrieve different pieces of data. This can lead to inefficiencies, as the client may need to make multiple round trips to the server to gather all the data it needs.

In contrast, HATEOAS provides additional information in the API response that allows the client to dynamically discover and navigate the API, making the API more flexible and discoverable compared to traditional REST queries. In the same way, when you call an API to create an object, instead of returning just the new ID or the full object with its details, it returns the REST URL that you can call to get details about the object – you get more than the ID, and less than the object 😀
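
For example, a minimal, hypothetical HAL-style response to a “create order” call could look like the snippet below – the resource, fields and link relations are invented for illustration, not taken from any particular API:

```python
import json

# Hypothetical response a HATEOAS API might return after POST /orders,
# instead of just the new ID or the full object: the links tell the client
# what it can do next, so URLs never need to be hardcoded.
create_order_response = {
    "id": 4711,
    "status": "created",
    "_links": {
        "self":    {"href": "/orders/4711"},          # fetch full details
        "cancel":  {"href": "/orders/4711/cancel"},   # allowed next action
        "payment": {"href": "/orders/4711/payment"},  # allowed next action
    },
}

print(json.dumps(create_order_response, indent=2))
```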

Similarly, GraphQL provides a more efficient and flexible way of querying data compared to traditional REST queries. With GraphQL, the client can specify exactly what data it needs in a single request, reducing the number of round trips to the server and enabling the client to retrieve only the data it requires.
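
For contrast, here is an equally small sketch of the GraphQL side – a single request naming exactly the fields the client needs. The endpoint URL and schema are placeholders I made up for the example:

```python
import json
import urllib.request

# Hypothetical GraphQL endpoint and schema, used only to illustrate the idea:
# one request, and the client names exactly the fields it wants back.
query = """
{
  order(id: 4711) {
    status
    items { name quantity }
  }
}
"""

request = urllib.request.Request(
    "https://api.example.com/graphql",          # placeholder URL
    data=json.dumps({"query": query}).encode(),
    headers={"Content-Type": "application/json"},
)
# response = urllib.request.urlopen(request)   # uncomment against a real endpoint
```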

Overall, both HATEOAS and GraphQL offer improvements over traditional REST queries, with HATEOAS focusing on providing a more flexible and discoverable API and GraphQL providing a more efficient way of querying data.

Various languages have libraries that make it easy to produce and navigate such responses – however, you should note that this list is not exhaustive, and other libraries may also be available. Additionally, HATEOAS support can often be implemented manually in any language, as it is a design constraint rather than a specific technology (a small client-side sketch follows the list):

  • C#: WebApi.Hal, HAL-CS
  • Java: Spring HATEOAS, JHAL
  • Python: Django Rest Framework, Tastypie
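
Even without one of these libraries, following links instead of hardcoding URLs is straightforward. A minimal client-side sketch (against an assumed, illustrative API shape) might look like this:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder base URL for this sketch

def get_json(url):
    """Fetch a resource and parse its JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read())

def follow(resource, relation):
    """Follow a named link ('rel') advertised by the resource itself."""
    href = resource["_links"][relation]["href"]
    return get_json(BASE_URL + href)

# The client only hardcodes the entry point; everything else is discovered.
# root = get_json(BASE_URL + "/")
# orders = follow(root, "orders")
# first_order = follow(orders, "first")
```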

HATEOAS is useful in situations where the API clients need to discover and navigate the API dynamically, without having to hardcode the API’s structure. Here are a few examples and use cases:

  • SPA (Single Page Application) – An SPA is a web application that updates its content dynamically, without reloading the entire page. HATEOAS can be used to provide the necessary information to the SPA so that it can dynamically navigate and discover the API, without having to hardcode the URLs.
  • Mobile Applications – Mobile applications often have limited connectivity and may not have the latest information about the API. HATEOAS can be used to provide the necessary information to the mobile application so that it can dynamically discover and navigate the API, even when the network is slow or unreliable.
  • Microservices – In a microservices architecture, the APIs between services are often changed and updated frequently. HATEOAS can be used to provide the necessary information to the client so that it can dynamically discover and navigate the API, even when the underlying implementation changes.
  • Versioning – When a new version of an API is released, the URLs and endpoint names may change. HATEOAS can be used to provide the necessary information to the client so that it can dynamically discover and navigate the latest version of the API, without having to hardcode the URLs.

HATEOAS is a widely adopted constraint in RESTful API design and has been used by many companies and organizations. Here are a few examples – many other companies and organizations also use HATEOAS in their APIs. However, specific case studies are not readily available as the usage of HATEOAS is often part of the internal implementation details of an API and is not widely publicized.

  • Amazon Web Services (AWS) – AWS uses HATEOAS in many of its APIs, such as the Amazon S3 API, to allow clients to dynamically discover and navigate the API.
  • Netflix – Netflix uses HATEOAS in its APIs to allow clients to dynamically discover and navigate the API and to improve the discoverability and flexibility of its APIs.
  • Salesforce – Salesforce uses HATEOAS in its APIs to allow clients to dynamically discover and navigate the API and to improve the usability and efficiency of its APIs.
  • Twitter – Twitter uses HATEOAS in its APIs to allow clients to dynamically discover and navigate the API and to improve the usability and efficiency of its APIs.

“Train people well enough so they can leave. Treat them well enough so they don’t want to”

Like always, it is about balancing employee training and treatment. The sentiment behind the statement “train people well enough so they can leave, treat them well enough so they don’t want to” is one of balance and foresight. A company or organization that invests in its employees and helps them grow both professionally and personally has the potential to reap significant rewards. By training its workers well and creating a positive work environment, a business can increase its chances of retaining valuable employees and improve overall morale.

The first half of the statement, “train people well enough so they can leave,” implies that companies should help their employees acquire the skills and knowledge they need to succeed in their careers, regardless of whether they stay with the organization. By doing so, the company is providing them with opportunities for growth and advancement that will benefit both the employee and the company in the long run. This kind of investment in employee development can help establish the company as a desirable place to work and create a reputation for excellence in the industry.

The second half of the statement, “treat them well enough so they don’t want to,” highlights the importance of creating a positive work environment. Employees who feel valued and appreciated are more likely to be productive and motivated, and they are also less likely to leave the company in search of a better work environment. A business that provides its employees with a supportive and inclusive atmosphere, fair pay and benefits, and opportunities for growth and advancement is more likely to retain its best workers and attract new talent.

The key to success, then, is finding the right balance between training and treatment. A company that provides excellent training but fails to create a positive work environment is likely to experience high turnover, as employees seek better opportunities elsewhere. On the other hand, a company that provides a great work environment but fails to invest in employee development will likely struggle to retain its best workers as they seek out more challenging and rewarding career opportunities.

In conclusion, the statement “train people well enough so they can leave, treat them well enough so they don’t want to” underscores the importance of investing in employee development and creating a positive work environment. By doing so, companies can increase the chances of retaining their best workers, improve morale, and establish a reputation for excellence in the industry. And for me, the training I was lucky to give happened, many times, to be the TAP.

DORA and Agile – a bake off of delivery pipeline measurement techniques

Today’s post is inspired by Matt Shuster, who asked about my opinion on DORA vs Agile pipelines. So, let’s see the basics first: measuring the performance of an agile delivery pipeline requires a combination of metrics that focus on both efficiency and effectiveness. Here are a few metrics that are commonly used for this purpose (a small calculation sketch follows the list):

  • Lead time: The time from when a feature is requested to when it is delivered to the customer.
  • Cycle time: The time it takes to complete a specific task or feature from start to finish.
  • Throughput: The number of features delivered per unit of time.
  • Defect density: The number of defects per unit of delivered code.
  • Deployment frequency: The frequency of code releases to production.
  • Time to restore service: The time it takes to restore service after a production failure.
  • User satisfaction: Feedback from users on the quality and functionality of the delivered features.
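
To make these a bit more tangible, here is a tiny, self-contained Python sketch – the work items and dates are made up – showing how lead time, cycle time and throughput could be derived from basic timestamps:

```python
from datetime import date

# Made-up work items: when each was requested, started and delivered.
work_items = [
    {"requested": date(2023, 1, 2), "started": date(2023, 1, 9),  "delivered": date(2023, 1, 16)},
    {"requested": date(2023, 1, 3), "started": date(2023, 1, 10), "delivered": date(2023, 1, 14)},
    {"requested": date(2023, 1, 5), "started": date(2023, 1, 12), "delivered": date(2023, 1, 20)},
]

lead_times  = [(w["delivered"] - w["requested"]).days for w in work_items]  # request -> delivery
cycle_times = [(w["delivered"] - w["started"]).days   for w in work_items]  # start -> delivery

period_days = (max(w["delivered"] for w in work_items) - min(w["requested"] for w in work_items)).days
throughput_per_week = len(work_items) / (period_days / 7)

print(f"avg lead time:  {sum(lead_times) / len(lead_times):.1f} days")
print(f"avg cycle time: {sum(cycle_times) / len(cycle_times):.1f} days")
print(f"throughput:     {throughput_per_week:.1f} items/week")
```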

As with all metrics, these metrics should be regularly monitored and used to continuously improve the delivery pipeline by identifying bottlenecks, optimizing workflows, and reducing waste. Additionally, as Agile is not one-size-fits-all, it’s important to regularly reassess and adjust the metrics used to ensure they accurately reflect the goals and priorities of the organization.

On the other hand, let’s quickly look at DORA. The DORA (Accelerate) framework is a set of four metrics that provide a comprehensive view of the performance of an organization’s software delivery process. The four metrics are (again with a small sketch after the list):

  • Lead time: The time it takes to go from code committed to code successfully running in production.
  • Deployment frequency: The number of times per day that code is successfully deployed to production.
  • Mean time to recovery: The average time it takes to restore service after an incident.
  • Change failure rate: The percentage of changes that result in a production failure.
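
And in the same style, the DORA numbers can be derived from a simple deployment log – again, the data and field names below are invented purely for illustration:

```python
from datetime import datetime

# Made-up deployment log: when each change hit production, whether it failed,
# and (if it failed) when service was restored.
deployments = [
    {"deployed": datetime(2023, 2, 1, 10), "committed": datetime(2023, 1, 31, 9), "failed": False, "restored": None},
    {"deployed": datetime(2023, 2, 1, 15), "committed": datetime(2023, 2, 1, 11), "failed": True,  "restored": datetime(2023, 2, 1, 16)},
    {"deployed": datetime(2023, 2, 2, 11), "committed": datetime(2023, 2, 1, 17), "failed": False, "restored": None},
    {"deployed": datetime(2023, 2, 2, 16), "committed": datetime(2023, 2, 2, 9),  "failed": False, "restored": None},
]

lead_times = [(d["deployed"] - d["committed"]).total_seconds() / 3600 for d in deployments]
failures   = [d for d in deployments if d["failed"]]
recovery   = [(d["restored"] - d["deployed"]).total_seconds() / 3600 for d in failures]

days_covered = (deployments[-1]["deployed"].date() - deployments[0]["deployed"].date()).days + 1

print(f"lead time (avg):       {sum(lead_times) / len(lead_times):.1f} h")   # commit -> production
print(f"deployment frequency:  {len(deployments) / days_covered:.1f} per day")
print(f"mean time to recovery: {sum(recovery) / len(recovery):.1f} h")
print(f"change failure rate:   {100 * len(failures) / len(deployments):.0f} %")
```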

These metrics align well with the metrics commonly used to measure the performance of an agile delivery pipeline and can be used in a complementary manner to validate the software architecture. For example, a low lead time and high deployment frequency indicate that the delivery pipeline is efficient and streamlined, while a low change failure rate and mean time to recovery indicate that the architecture is robust and reliable.

I promised a bake off, so, here we are 🙂 The comparison between using metrics to validate a software architecture and using the DORA framework is that both provide different but complementary perspectives on the performance of an organization’s software delivery process.

On one hand, metrics such as lead time, cycle time, throughput, and defect density focus on efficiency and effectiveness of the delivery pipeline. They help to measure the time taken to complete a task, the speed at which features are delivered, and the quality of the delivered code. These metrics provide insight into the processes and workflows used in the delivery pipeline and help identify areas for improvement.

On the other hand, the DORA framework provides a comprehensive view of the performance of an organization’s software delivery process by focusing on four key metrics: lead time, deployment frequency, mean time to recovery, and change failure rate. These metrics help to measure the speed and reliability of the delivery pipeline and provide insight into the resilience and stability of the software architecture.

So, which of them to use? By using both sets of metrics together, organizations can get a complete picture of their delivery pipeline performance and identify areas for improvement in both architecture and processes. This can help ensure that the architecture supports the needs of the organization and the goals of the delivery pipeline, while also providing a way to continually assess and optimize performance over time. For example, metrics such as lead time and cycle time can highlight bottlenecks and inefficiencies in the delivery pipeline, while metrics such as change failure rate and mean time to recovery can highlight weaknesses in the architecture that may be contributing to production failures.

In summary, using metrics to validate a software architecture and using the DORA framework together provides a comprehensive view of the performance of an organization’s software delivery process and helps to identify areas for improvement in both architecture and processes. As probably you figured out, I like case studies and tools, so… here we are 🙂

  • Netflix: Netflix uses a combination of metrics, including lead time and cycle time, to measure the performance of its delivery pipeline. They use this data to continuously optimize their processes and improve their architecture, resulting in a highly efficient and effective delivery pipeline.
  • Amazon: Amazon uses a combination of metrics, including deployment frequency and mean time to recovery, to measure the performance of its delivery pipeline. By regularly monitoring these metrics, Amazon has been able to achieve a high level of reliability and stability in its software architecture, allowing them to quickly and effectively respond to incidents and restore service.
  • Spotify: Spotify uses a combination of metrics, including lead time and throughput, to measure the performance of its delivery pipeline. By using these metrics to continuously optimize their processes and improve their architecture, Spotify has been able to increase the speed and efficiency of its delivery pipeline, allowing them to deliver high-quality features to users faster.
  • Google: Google uses a combination of metrics, including lead time, deployment frequency, and mean time to recovery, to measure the performance of its delivery pipeline. By using these metrics to continuously improve its processes and architecture, Google has been able to achieve a high level of reliability and stability in its delivery pipeline, allowing it to deliver high-quality features and updates to users quickly and efficiently.
  • Microsoft: Microsoft uses a combination of metrics, including lead time and cycle time, to measure the performance of its delivery pipeline. By using these metrics to continuously optimize its processes and improve its architecture, Microsoft has been able to increase the speed and efficiency of its delivery pipeline, allowing it to deliver high-quality features and updates to users faster.
  • Shopify: Shopify uses a combination of metrics, including deployment frequency, mean time to recovery, and change failure rate, to measure the performance of its delivery pipeline. By using these metrics to continuously improve its processes and architecture, Shopify has been able to achieve a high level of reliability and stability in its delivery pipeline, allowing it to deliver high-quality features and updates to users quickly and efficiently.
  • Airbnb: Airbnb uses a combination of metrics, including lead time, deployment frequency, and mean time to recovery, to measure the performance of its delivery pipeline. By using these metrics to continuously improve its processes and architecture, Airbnb has been able to achieve a high level of reliability and stability in its delivery pipeline, allowing it to deliver high-quality features and updates to users quickly and efficiently.

These case studies demonstrate the importance of regularly measuring and analyzing performance metrics to validate a software architecture and improve the delivery pipeline. By using a combination of metrics and regularly reassessing and adjusting their approach, organizations can continuously improve their delivery pipeline and ensure that their architecture supports the needs of the organization and the goals of the delivery pipeline. And speaking of tools – there are various tools and software that can be used to measure the DORA framework measures. Some popular options include:

  • Datadog: Datadog provides real-time monitoring and analytics for cloud-scale infrastructure, applications, and logs. It can be used to track key performance indicators, including lead time, deployment frequency, mean time to recovery, and change failure rate, and generate reports and alerts based on that data.
  • New Relic: New Relic is a performance management platform that provides real-time visibility into application performance. It can be used to track and analyze key performance indicators, such as lead time, deployment frequency, and mean time to recovery, and generate reports and alerts based on that data.
  • Splunk: Splunk is a software platform for searching, analyzing, and visualizing machine-generated big data. It can be used to track and analyze key performance indicators, such as lead time, deployment frequency, and mean time to recovery, and generate reports and alerts based on that data.
  • AppDynamics: AppDynamics is an application performance management solution that provides real-time visibility into the performance of applications and infrastructure. It can be used to track and analyze key performance indicators, such as lead time, deployment frequency, and mean time to recovery, and generate reports and alerts based on that data.
  • Prometheus: Prometheus is an open-source systems monitoring and alerting toolkit. It can be used to track and analyze key performance indicators, such as lead time, deployment frequency, and mean time to recovery, and generate reports and alerts based on that data.
  • InfluxDB: InfluxDB is an open-source time series database. It can be used to track and analyze key performance indicators, such as lead time, deployment frequency, and mean time to recovery, and generate reports and alerts based on that data.
  • Grafana: Grafana is an open-source data visualization and analysis platform. It can be used to track and analyze key performance indicators, such as lead time, deployment frequency, and mean time to recovery, and generate reports and alerts based on that data.
  • Nagios: Nagios is an open-source IT infrastructure monitoring solution. It can be used to track and analyze key performance indicators, such as lead time, deployment frequency, and mean time to recovery, and generate reports and alerts based on that data.
  • JIRA: JIRA is a project and issue tracking software. It can be used to track the lead time and cycle time of the delivery pipeline by monitoring the time it takes for work items to move through the various stages of the development process.

These are just a few examples of software tools that can be used to measure and track DORA framework metrics. The specific tool or combination of tools used will depend on the needs of the organization and the size and complexity of the delivery pipeline. For agile, we have our set of tools too, I picked only some of them (and yes, JIRA is on both lists):

  • Trello: Trello is a visual project management tool that can be used to track and visualize the progress of work items through the different stages of the development process.
  • Asana: Asana is a team collaboration tool that can be used to track and visualize the progress of work items through the different stages of the development process.
  • JIRA: JIRA is a project and issue tracking software that can be used to track and visualize the progress of work items through the different stages of the development process.
  • Clubhouse: Clubhouse is a project management tool specifically designed for agile teams. It can be used to track the progress of work items through the different stages of the development process and visualize the flow of work through the delivery pipeline.
  • Pivotal Tracker: Pivotal Tracker is an agile project management tool that can be used to track and visualize the progress of work items through the different stages of the development process.

I hope this helped answer the DORA vs Agile metrics question, with the answer being: use both, together 🙂

How to avoid the cloud hosting price creep?

Some call it penny-pinching, some call it watching the cash flow, but everyone is trying to avoid the price creep that can happen when you are using the cloud. An honest mistake can have terrible effects – and if you don’t know your own architecture well enough, it can be disastrous. So, here are some strategies to effectively avoid creeping cloud hosting prices (a small savings estimate is sketched after the list):

  • Monitor usage and costs regularly: Use tools such as Azure Cost Management, AWS Cost Explorer or Google Cloud’s billing dashboard to keep track of resource utilization and costs. This will help you identify areas where you can reduce spending, such as underutilized instances or overpriced storage services.
  • Use reserved instances or committed use contracts: These provide a significant discount compared to on-demand pricing by committing to use a specific amount of resources for a set period of time, e.g., 1 or 3 years.
  • Take advantage of spot instances: Spot instances are unused Azure or EC2 instances that are made available at a discount to users willing to take the risk of having their instances terminated when the spot price increases. Yes, it requires thinking through your microservices infrastructure and your circuit breakers, but the savings can be tremendous.
  • Optimize your infrastructure: Right-sizing instances, using auto-scaling groups, and using managed services like Azure SQL Database or AWS RDS instead of running your own database servers can help reduce costs.
  • Use managed services: For me, this is probably my biggest pet peeve: using managed services like Azure Functions, AWS Lambda or Google Cloud Functions instead of running your own servers can greatly reduce costs and eliminate the need for infrastructure management. Believe it or not, you will rarely be able to optimize a custom solution – whether written by you or built on some 3rd party product – to be as cost effective as the CSP’s own managed offering: the provider is directly motivated to make it cost efficient (to achieve a better “compression ratio” on the same hardware), it is guaranteed to be compatible with their other services, authentication works out of the box, and so on.
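
As a back-of-the-envelope illustration – the hourly prices below are made up, and real numbers depend on provider, region and instance type – here is how the reserved and spot discounts translate into monthly savings:

```python
# Hypothetical hourly prices for one instance; real prices vary by provider,
# region and instance type – this only shows the shape of the math.
on_demand_per_hour = 0.10
reserved_per_hour  = 0.06   # e.g. a 1- or 3-year commitment discount
spot_per_hour      = 0.03   # interruptible capacity

hours_per_month = 730
instances       = 20

def monthly_cost(price_per_hour: float) -> float:
    return price_per_hour * hours_per_month * instances

baseline = monthly_cost(on_demand_per_hour)
for label, price in [("on-demand", on_demand_per_hour),
                     ("reserved", reserved_per_hour),
                     ("spot", spot_per_hour)]:
    cost = monthly_cost(price)
    print(f"{label:10s} ${cost:8.2f}/month  ({100 * (baseline - cost) / baseline:.0f}% saved)")
```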

You can use a combination of various tools to achieve this, just look at these examples:

  • Dropbox: Dropbox was able to save over 30% on its cloud hosting costs by using reserved instances and spot instances to reduce its spending on compute resources. In addition, Dropbox optimized its infrastructure to make it more cost efficient, including reducing the number of instances it was using, right-sizing its instances, and using auto-scaling groups to ensure that its resources were being used optimally. This combination of strategies allowed Dropbox to reduce its overall cloud hosting costs while maintaining the same level of service and reliability.
  • Capital One: Capital One reduced its cloud hosting costs by over 50% by using a combination of reserved instances, managed services, and a cost optimization program. Capital One adopted a proactive approach to cost optimization, monitoring its cloud usage and costs on a regular basis, and implementing cost optimization strategies when necessary. For example, Capital One used reserved instances to commit to a certain amount of resources over a set period of time, which allowed it to receive significant discounts compared to on-demand pricing. In addition, Capital One adopted managed services like AWS RDS and AWS Lambda to reduce the amount of infrastructure it needed to manage and maintain.
  • Expedia: Expedia reduced its cloud hosting costs by 20% by using reserved instances, auto-scaling, and by optimizing its infrastructure for cost efficiency. Expedia adopted a multi-pronged approach to cost optimization, which included committing to a certain amount of resources over a set period of time through reserved instances, using auto-scaling to ensure that its resources were being used optimally, and right-sizing its instances to reduce the amount of resources it was using. These strategies allowed Expedia to reduce its cloud hosting costs while maintaining the same level of service and reliability.
  • SoundCloud: SoundCloud reduced its cloud hosting costs by over 50% by moving from on-demand instances to reserved instances and by optimizing its infrastructure for cost efficiency. By using reserved instances, SoundCloud was able to commit to a certain amount of resources over a set period of time, which allowed it to receive significant discounts compared to on-demand pricing. In addition, SoundCloud optimized its infrastructure to reduce the amount of resources it was using and to ensure that its resources were being used optimally, which allowed it to further reduce its cloud hosting costs.
  • SmugMug: SmugMug reduced its cloud hosting costs by over 60% by using reserved instances, spot instances, and by optimizing its infrastructure for cost efficiency. SmugMug adopted a multi-pronged approach to cost optimization, which included using reserved instances to commit to a certain amount of resources over a set period of time, using spot instances to take advantage of unused EC2 instances that were made available at a discount, and optimizing its infrastructure to reduce the amount of resources it was using and to ensure that its resources were being used optimally. These strategies allowed SmugMug to reduce its cloud hosting costs while maintaining the same level of service and reliability.
  • Netflix: Netflix reduced its cloud hosting costs by over 80% through a combination of reserved instances, spot instances, and optimizing its infrastructure.

Of course, as always, it is important to note that these strategies and tools will vary based on the specific cloud provider and your use case, so it is important to carefully evaluate your options and choose the best solution for your needs. However, these case studies provide a good starting point and demonstrate the potential savings that can be achieved through a proactive approach to cloud hosting cost optimization.