MTA - Metropolitan Transportation Authority

03/10/2026 | News release | Distributed by Public on 03/10/2026 15:04

Lessons learned in managing the MTA’s Open Data program

The MTA has a vast library of data covering a wide range of topics including service performance, budget, capital assets, workforce, and more, all of which is valuable to the public. But it wasn't always this way. Before 2021, most MTA data was only released in non-open formats, such as PDFs, and through Freedom of Information Law (FOIL) requests (except for real-time vehicle arrival data feeds, which we have published in open format since 2012). Ever since Governor Kathy Hochul and the New York State Legislature enacted the MTA Open Data Law in 2021, the MTA's open data program has grown rapidly. In fact, it's now an award-winning program that can serve as a model for other government agencies standing up their own open data programs.

Good open data programs improve government transparency and accountability by making data easier to understand and work with for riders, researchers, and curious customers. Equally important, open data makes data easier to work with internally. You can read more about the tangible organizational benefits of better-managed, more accessible data in this previous lessons learned blog post.

In this next lessons learned post, we offer a guide on how to launch an open data program in government, using best practices we learned in the first years of establishing our program. The MTA's open data program has come a long way in five years, and we hope this resource helps other government agencies across the United States follow suit!

Why we're qualified

2021 was a turning point for MTA open data. Before then, data was regularly shared publicly in PDF or static Excel files. We had only 21 open datasets available to the public.

The passage of the MTA Open Data Law in 2021 officially required the MTA to publish its data in open, machine-readable formats. Since then, our offerings have increased, with over 230 open data assets available to the public today. And we've conquered some technically complex and high-profile topics too, like the operating budget (we were the first transit agency in the United States to publish it as open data!), subway origin-destination ridership, and traffic counts in the Manhattan congestion relief zone.

The role of an open data mandate

As mentioned above, the governor and state legislature passed the MTA Open Data Law in 2021, officially requiring the MTA to publish information in an open format. For the MTA, this mandate was the catalyst for establishing a formal program and freeing up the resources needed to get the program off the ground. The MTA's open data law contains language that is common across many open data policies and statutes. A few pieces of the legislation were particularly helpful in establishing how the program would work:

  • Designate a data coordinator, who is responsible for managing the program and complying with the law.
  • Create a three-year plan to release all publishable data. This established a firm time-bound goal to keep things moving.
  • Define a portal to contribute to, which in our case, was New York State's open data portal at data.ny.gov. While some agencies may need to create a portal from scratch, we could plug into a data hub that already existed, allowing us to focus more effort on other key parts of the program.

A mandated annual report to document progress is common in other open data statutes. While not included in our statutory requirements, we still publish an annual update on our program page, which we have found incredibly valuable in memorializing our success.

Key players

Identifying who needs to be part of the immediate team, and who needs to be engaged in the surrounding circle, is one of the first steps to starting a successful open data program. Below is a description of each contributor:

  • Program manager
    • The program manager leads the program's day-to-day operations and is the principal technical liaison between data owners and data users. They set the direction for what datasets will be published and manage the work it takes to make a dataset open.
  • Data scientist(s)
    • Data scientists write the code that produces datasets for public consumption. Some datasets require complex transformations from raw data sources before they have any analytical value; data scientists are the experts in the codebase that leads to these products.
  • Data engineer(s)
    • Data engineers architect the tooling required to automate the creation of the datasets from source systems, as well as the push (or pull) of the data to a publicly accessible portal. You can read more about the tooling we use in this blog post.
  • Internal leadership
    • Executive level buy-in and active sponsorship from someone at the deputy or assistant executive level is crucial. They can help prioritize datasets for release to ensure alignment with the organization's strategic goals. A champion at the executive level also helps move things along when they get stuck, especially if open data is new to your organization.
    • Leadership in legal and communications offices is helpful as well. Legal serves as an important safeguard in the review process before anything is released and can connect the dots between datasets published to open data and information frequently FOILed. The communications office can support the program by directing the press to open data so they can self-serve, and it can make sure that messaging in dataset documentation and accompanying resources aligns with other agency communications.
  • Internal users
    • Internal users are analysts within the organization who use datasets on the open data portal. Especially in the early stages of an open data program, the open data portal will more than likely be the most efficient tool your organization has for cross-departmental data sharing.
  • External users and advocates
    • External users and advocates are people outside of the organization who use datasets on the open data portal. The technical skill range of these users varies widely, and it is important to consider how users of different analytic levels will engage with your program.
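
For users who are comfortable with code, pulling a dataset from a portal can take only a few lines. The sketch below is an illustration, not a description of the MTA's tooling. It assumes the portal exposes a Socrata-style SODA endpoint of the form /resource/<dataset-id>.json with $limit and $where query parameters. The dataset ID "abcd-1234" and the column name "month" are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_soda_url(domain, dataset_id, where=None, limit=1000):
    """Build a query URL for a Socrata-style open data endpoint."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    # safe="$" keeps the "$" prefix readable in parameter names.
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


# Hypothetical dataset ID and column name, for illustration only.
url = build_soda_url(
    "data.ny.gov", "abcd-1234", where="month >= '2024-01-01'", limit=5
)
print(url)
```

The resulting URL could be opened in a browser or passed to any HTTP client. Less technical users would more likely rely on the portal's own export buttons or on visualizations built from the same data.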

Identifying and preparing your first datasets for release

When identifying and preparing your first datasets for release, we recommend prioritizing easy wins, such as data that is already made public (but not open) in some form or data that is frequently requested through the FOIL process. We focused primarily on data that was already made public on a regular basis via PDFs. For example, monthly performance metrics for MTA board and committee materials were only shared in PDFs, but the open data program allowed us to automate and share that data in an open format to supplement the monthly PDF releases.

A small but critical benefit of focusing open data efforts on data that is already public in some form is that it allows for progress in the absence of a formalized data governance policy. Getting your first datasets in a machine-readable format up on a portal will reveal the gaps in your technical processes. Being able to figure out the technical details and process, without simultaneously having the complicated conversations of whether the data is okay to release in the first place, is an approach that worked well for us.
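
To make "machine-readable" concrete: a table laid out for a PDF page often has one column per month, which is easy to read but awkward to filter or join. The sketch below reshapes such a table into one row per metric per month and writes it as CSV. It is an illustration only and not the MTA's pipeline. The metric names and values are invented placeholders, not MTA data.

```python
import csv
import io

# Report-style layout: one row per metric, one column per month.
# Invented placeholder values, not real MTA figures.
report_rows = [
    {"metric": "Metric A", "2024-01": "81.2", "2024-02": "83.5"},
    {"metric": "Metric B", "2024-01": "7.4", "2024-02": "6.9"},
]


def to_tidy(rows):
    """Reshape wide rows into one record per (metric, month)."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == "metric":
                continue
            tidy.append(
                {"metric": row["metric"], "month": key, "value": float(value)}
            )
    return tidy


def to_csv(records):
    """Serialize tidy records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["metric", "month", "value"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


tidy_records = to_tidy(report_rows)
print(to_csv(tidy_records))
```

The long format adds rows rather than columns as new months arrive, so the column layout stays stable for anyone who has built an analysis or chart on top of the dataset.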

Engaging the community

The importance of engaging the open data community by building relationships with key stakeholders and advocates cannot be overstated! These organizations will be your biggest champions while also holding your organization accountable in a way that sustains momentum. Advocates are important external validators, and if they can provide positive feedback on your first open data releases, it will help build momentum internally for your open data program.

If open data advocates don't exist in your space, you could take the initiative to generate excitement and communicate your work. To start, you could participate in or host public-facing events about open data, partner with an existing local civic tech group, or work with the press office to generate positive coverage on your first open data release.

Not all advocates will want to work directly with the open datasets you publish, so investing in data visualization tools (built off open data, of course!) will go a long way. Accompanying visualizations increase the visibility of your program and help win over advocates who operate more in the policy and communications sphere. We put quite a bit of work into building our data visualization tool metrics.mta.info early on, which visualizes open datasets in a way that is easily digestible for people of all technical skill levels.

Finally, let data users connect directly with the person and/or team managing your program. A team inbox ([email protected]) has worked well for us.

Improvements after a successful first year

By the end of year one, you should have published a few key datasets to an open data portal, proved the value of open data to your organization, and built relationships with a few key stakeholders both internal and external to your agency.

After our first year of the program, we were still manually uploading every single open dataset to the open data portal. We desperately needed the expertise of a data engineer to scale the program in a sustainable way. But we needed to take those first steps of sharing data openly to know what exactly we needed. All that's to say, don't let a lack of technical skills keep you from seeing what you can accomplish. We were still able to publish a lot of datasets in our first year of the program, and now we can publish even more.

We hope this post has given some concrete ideas on where to start when it comes to spinning up an open data program at your agency or organization. And for those reading because you hope to see a public agency in your city publish more open data, we encourage you to approach your advocacy efforts with understanding and optimism as the organization builds its capacity and resources to change the status quo.

About the author

Lisa Mae Fiedler manages MTA's Open Data program. She is thrilled to celebrate NYC Open Data Week later this month; you can check out the calendar of events here.

MTA - Metropolitan Transportation Authority published this content on March 10, 2026, and is solely responsible for the information contained herein. Distributed via Public Technologies (PUBT), unedited and unaltered, on March 10, 2026 at 21:04 UTC.