
A risk-based approach to open source strategy

On its face, using open source code is an inherently risky endeavor. We are trusting external developers to write code that we'll eventually deliver to our users, and we have no recourse if the code is buggy or malicious in some way. In practice, people are generally good, and the forces that govern the popularity of open source projects help reduce this risk to the point where nearly all folks in the industry are prolific consumers of open source code.

One problem with the current approach is that the evaluation of risk happens once up front, but rarely thereafter. When selecting a library, folks look at alternatives for the same functionality in hopes of selecting a robust solution. They may also look at the issue tracker to validate that bugs are addressed in a timely manner. Once the selection is made, though, these metrics aren't revisited. This presents an issue because projects, ecosystems, and communities aren't static: what was a well-maintained project may fall into disrepair as the primary author finds other focuses.

That level of evaluation and re-evaluation happens on a project-by-project basis. For companies with a large open-source usage footprint, this is a problem that needs to be addressed at scale. Today, we don't have a term for "the pile of open source code I rely on" beyond perhaps "dependencies". I've been thinking about it in terms of a "consumption portfolio". We consume a great deal of code, and much like a stock portfolio, it contains a mix of risk profiles and expected change trajectories. Innovation and stability are qualities that we can monitor and shape over time depending on the needs of our business, just as we might adjust the risk-versus-reward balance in a financial portfolio.

Moving forward, organizations interested in tracking the risk of their consumption portfolio should begin by gathering an inventory. Thanks to the push by various national governments, Software Bills of Materials (SBOMs) are becoming more mainstream. SBOMs allow organizations that depend on open source to capture their dependencies (and, depending on how they're set up, their transitive dependencies). The result is a rich dataset of software composition, which we can use to drive our understanding of our consumption portfolio. A variety of additional tools can increase the utility of this data, so that it benefits not only software composition analysis but also security vulnerability management, license compliance, and a host of other regulatory/compliance interests.
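
As a sketch of what that inventory step can look like: the example below reads a CycloneDX-style JSON SBOM with the Jackson library and lists each component. The field names ("components", "name", "version", "purl") follow the CycloneDX format; the file-path argument is just for illustration.

// a minimal SBOM inventory sketch (CycloneDX JSON + Jackson assumed)
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;

public class SbomInventory {
  public static void main(String[] args) throws Exception {
    JsonNode sbom = new ObjectMapper().readTree(new File(args[0]));
    // each entry in "components" is one (possibly transitive) dependency
    for (JsonNode component : sbom.path("components")) {
      System.out.printf("%s@%s (%s)%n",
          component.path("name").asText(),
          component.path("version").asText("unknown"),
          component.path("purl").asText("no purl"));
    }
  }
}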

Once we have a data store with the relevant source data, we can begin to analyze each of these dependencies. Each dependency exists on a spectrum of strategic/not and popular/not. Strategic, in this context, means that a project is used within a large portion of our group's applications; because of this ubiquitous usage, any risk in these projects will have a disproportionate impact on our organization. "Popular" is a placeholder for the community stability of a project. It's certainly an imperfect word for this, but the general sense is that "popular" projects have the time, people, and resources they need to be healthy.

strategic/popular    |  one-off / popular
(safe/boring)        |  (legacy replacement)
                     |
                     |
---------------------+-------------------
                     |
                     |
(risky/legacy)       |  (necessary, uninteresting)
strategic/unpopular  |  one-off / unpopular

Tools which are strategic and popular may be foundational things like the Java language or the Django web framework. These have broad use within the industry and are probably not at existential risk due to lack of involvement from their users. Projects which are strategic but not well funded are risky; these are likely to be future legacy dependencies in the organization. Tools which are popular but aren't in wide use in our data set may be upcoming bets on the next "strategic" tool; regardless, given their popularity, they are generally safe dependencies to have. Dependencies which are neither popular nor strategic are just dependencies in use by a few teams and don't represent any broader pattern within the organization. These are safe to ignore for now, though we may look to move toward more popular replacements if they start to gain traction internally.
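
To make the quadrants concrete, here's a sketch of how the classification might be computed. The inputs (an internal usage count, a community-health score) and the thresholds are hypothetical stand-ins for whatever signals your SBOM data and community metrics actually provide.

// sketch: quadrant classification with made-up signals and thresholds
public record Dependency(String name, int internalUsers, double communityHealth) {

  public String quadrant() {
    boolean strategic = internalUsers >= 10;   // used across many of our applications
    boolean popular = communityHealth >= 0.5;  // has the time/people/resources it needs
    if (strategic && popular) return "safe/boring";
    if (strategic)            return "risky/legacy";
    if (popular)              return "upcoming bet / legacy replacement";
    return "necessary, uninteresting";
  }
}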

Strategic and unpopular dependencies, because of their ubiquity in the organization, will be expensive to move away from, so we should be actively working to reduce risk in this area. We can do this by shedding the internal use case the project supports, migrating to more popular alternatives, or putting in abstraction layers to reduce future switching costs. Alternatively, we could contribute resources (money, developers) to ensure the long-term health of that project's ecosystem.
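
As a sketch of the abstraction-layer tactic (DocumentStore and LegacyStoreAdapter are hypothetical names): our code depends on a small interface we own, and the at-risk dependency hides behind a single adapter, so replacing it later means rewriting one class instead of every call site.

// sketch: an owned interface that isolates an at-risk dependency
import java.util.HashMap;
import java.util.Map;

public interface DocumentStore {
  void put(String key, byte[] document);
  byte[] get(String key);
}

// the only class that would import the at-risk library
class LegacyStoreAdapter implements DocumentStore {
  private final Map<String, byte[]> delegate = new HashMap<>(); // stand-in for the real library

  @Override public void put(String key, byte[] document) {
    delegate.put(key, document); // in reality: call into the at-risk library
  }

  @Override public byte[] get(String key) {
    return delegate.get(key);    // in reality: call into the at-risk library
  }
}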

If you are working on this problem, let me know. I'm looking to deepen my involvement in these areas (and I'm currently for hire).

Thank you to Alex Scammon, Van Lindberg, and John Benninghoff for the underlying ideas that triggered this post. Thank you to Julia Ferraioli and Vijay Samuel for their reviews.

Feature Flags, Dynamic Config and Experimentation (oh my!)

As part of the OpenFeature project, I've been thinking a bunch about feature flags. There's been ambiguity about how feature flags differ from dynamic config, and whether that's the same thing as experimentation (e.g. A/B testing).

At their core, feature flags are really fancy if statements. Those if statements control the behavior we want to manage for our application.

// example OpenFeature Java code
if (client.getBooleanValue("should-show-redesign", /* default */ false)) {
  // show the redesign
} else {
  // show the existing experience
}

That may be backed by a file, or it may be backed by some giant distributed system. That's a serialization concern that's separate from the use of "gimme this config value". This brings up the question of how this relates to config in general: should we use something like this to fetch environment variables? System properties?

From my perspective, an abstraction layer like OpenFeature can be used to hold all config. We should be able to build a backing provider that checks locally for a config value before falling back to slower, remote sources. This allows the serialization of your config to migrate to wherever makes sense at the time. For local development, environment variables may be fine. As you move into productionization and need more fine-tuning, you can migrate toward a proper feature management platform and do things like "show the redesign to users who have purchased more than $30 in the past 10 days" or other powerful targeting. This puts OpenFeature squarely in the position of being a potential client of dynamic config systems.
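
As a hedged illustration of that local-first lookup (independent of the actual OpenFeature provider SPI; the remote lookup function is a hypothetical stand-in for a feature management platform):

// sketch: check local sources before slower, remote ones
import java.util.function.Function;

public class LocalFirstFlags {
  private final Function<String, Boolean> remote; // hypothetical remote source

  public LocalFirstFlags(Function<String, Boolean> remote) {
    this.remote = remote;
  }

  public boolean getBoolean(String key, boolean defaultValue) {
    // 1. environment variables first (e.g. SHOULD_SHOW_REDESIGN)...
    String env = System.getenv(key.toUpperCase().replace('-', '_'));
    if (env != null) return Boolean.parseBoolean(env);
    // 2. ...then JVM system properties...
    String prop = System.getProperty(key);
    if (prop != null) return Boolean.parseBoolean(prop);
    // 3. ...and only then the slower, remote source
    Boolean value = remote.apply(key);
    return value != null ? value : defaultValue;
  }
}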

When feature flags are mentioned, experimentation (and A/B tests in particular) isn't far behind. I believe that experimentation requires feature flagging: the ability to control a user's experience on a per-session or per-request basis is a necessary precondition, even if you don't expose that control to users directly. Beyond that, I think the main difference between a "feature flag" and an "experiment" is statistics and tracking. To determine whether variant A is better than variant B, we need to record whether each user saw variant A or B, then correlate that with whether they achieved the outcome we wanted. If we combine that with statistical rigor, we get an experiment result.
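
To make the tracking half concrete, here's a hypothetical sketch that counts exposures and outcomes per variant so conversion rates can be compared; the statistical rigor (significance testing, sample sizing) is deliberately left out.

// sketch: per-variant exposure and conversion tallies
import java.util.HashMap;
import java.util.Map;

public class ExperimentTally {
  private final Map<String, long[]> byVariant = new HashMap<>(); // {exposures, conversions}

  public void sawVariant(String variant) {
    byVariant.computeIfAbsent(variant, v -> new long[2])[0]++;
  }

  public void convertedOn(String variant) {
    byVariant.computeIfAbsent(variant, v -> new long[2])[1]++;
  }

  public double conversionRate(String variant) {
    long[] counts = byVariant.getOrDefault(variant, new long[2]);
    return counts[0] == 0 ? 0.0 : (double) counts[1] / counts[0];
  }
}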

OpenFeature is decidedly not trying to solve that experimentation side. We are hoping to provide relevant extension points so that you can emit the "user saw variant A" type events. From there, you're on your own with the statistics!
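
For illustration, here's a sketch of what such an extension point can look like: a hook that emits a "user saw variant X" event after each evaluation. The signatures follow my reading of the OpenFeature Java SDK's Hook interface and may differ between versions, so treat them as an assumption rather than the official API.

// sketch: an exposure-recording hook (SDK signatures assumed, may vary)
import dev.openfeature.sdk.FlagEvaluationDetails;
import dev.openfeature.sdk.Hook;
import dev.openfeature.sdk.HookContext;
import java.util.Map;

public class ExposureHook implements Hook<Boolean> {
  @Override
  public void after(HookContext<Boolean> ctx,
                    FlagEvaluationDetails<Boolean> details,
                    Map<String, Object> hints) {
    // hand off to whatever analytics pipeline records exposures
    System.out.printf("user saw flag=%s variant=%s%n",
        ctx.getFlagKey(), details.getVariant());
  }
}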
