
GitHub Copilot Trust Center

We enable developers and organizations to maximize their potential by prioritizing security, privacy, compliance, and transparency as we develop and iterate on GitHub Copilot.

Welcome to the GitHub Copilot Trust Center. We are excited you are here.

GitHub Copilot and AI

AI coding tools are already reshaping software development, and AI’s role in coding will only continue to grow. Here’s what we’ve learned so far.

Security

GitHub Copilot and security

GitHub Copilot uses top-notch Azure infrastructure and encryption, along with an AI-based vulnerability prevention system that blocks insecure coding patterns in real time.

Secure transmission/encryption

GitHub Copilot transmits data to GitHub's Azure tenant to generate suggestions, including both contextual data about the code and file being edited ("prompts") and data about the user's actions ("user engagement data"). The transmitted data is encrypted both in transit and at rest: Copilot-related data is encrypted in transit using transport layer security (TLS), and any data we retain at rest is encrypted using Microsoft Azure's data encryption (FIPS Publication 140-2 standards).
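As a minimal illustration of what encryption in transit means here, the Python sketch below sends a JSON payload over HTTPS while refusing anything older than TLS 1.2. The host, path, and payload fields are hypothetical placeholders for illustration, not Copilot's actual service endpoint or request format.

    # Illustrative only: an HTTPS client that refuses pre-TLS-1.2 connections.
    # The host, path, and payload below are hypothetical placeholders, not
    # GitHub Copilot's real service endpoint or wire format.
    import json
    import ssl
    import urllib.request

    context = ssl.create_default_context()            # verifies server certificates
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # reject anything older than TLS 1.2

    payload = json.dumps({"prompt": "def parse_config(", "editor": "example"}).encode()
    request = urllib.request.Request(
        "https://copilot-proxy.example.com/v1/completions",  # hypothetical endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )

    with urllib.request.urlopen(request, context=context) as response:
        print(response.status)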

Third-party testing/certification

  • Audits and Certifications: Copilot is not currently included in GitHub's existing audits and certifications, including SOC 2, ISO 27001, and FedRAMP Tailored. Compliance at GitHub begins with good security, so our first focus is fully onboarding Copilot to GitHub security programs and tooling. GitHub is engaging a third-party audit firm to perform a gap assessment of Copilot as part of readiness activities for SOC 2 Type 1 (security criteria) and ISO 27001, with the goal of completing the full audits for code completion by May 2024. Newly GA'd Copilot functionality will be onboarded on a six-month cycle starting in November 2024.

  • External Penetration Test: GitHub can provide, under NDA to our current Enterprise customers, a third-party penetration and application test report from the assessment performed on GitHub Copilot for Business. Additionally, GitHub Copilot is in scope for GitHub’s Bug Bounty program.

How can you help?

  • You can help by using GitHub Copilot and sharing feedback in the feedback forum. Please also report incidents (e.g., offensive output, code vulnerabilities, apparent personal information in code generation) directly to copilot-safety@github.com so that we can improve our safeguards. GitHub takes safety and security very seriously and we are committed to continually improving. 

  • Copilot is included in the GitHub Bug Bounty program. Copilot submissions are triaged and processed through the existing bug bounty workstreams.

How GitHub Copilot aids secure development

  • As suggestions are generated and before they are returned to the user, Copilot applies an AI-based vulnerability prevention system that blocks insecure coding patterns in real time, making Copilot suggestions more secure. Our model targets the most common vulnerable coding patterns, including hardcoded credentials, SQL injections, and path injections (a vulnerable pattern and a safer alternative are sketched after this list).

  • The system leverages LLMs to approximate the behavior of static analysis tools and can even detect vulnerable patterns in incomplete fragments of code. This means insecure coding patterns can be quickly blocked and replaced by alternative suggestions. 

  • The best way to build secure software is through a secure software development lifecycle (SDLC). GitHub offers solutions to assist with other aspects of security throughout the SDLC, including code scanning (SAST), secret scanning, and dependency management (SCA). We recommend enabling features like branch protection to ensure that code is merged into your codebase only after it has passed your required tests and peer review.
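Below is a minimal Python sketch of the kind of vulnerable coding pattern mentioned in the first item above (a SQL injection via string concatenation) alongside a parameterized alternative. It uses the standard sqlite3 module purely for illustration and does not reproduce Copilot's internal detection logic.

    # Illustration of a common vulnerable pattern and a safer counterpart.
    # This is not Copilot's internal detection logic, just the kind of code it targets.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

    user_input = "alice' OR '1'='1"

    # Vulnerable: string concatenation lets the input rewrite the query.
    vulnerable_query = "SELECT role FROM users WHERE name = '" + user_input + "'"
    print(conn.execute(vulnerable_query).fetchall())   # matches rows it should not

    # Safer: a parameterized query treats the input as data, not SQL.
    safe_query = "SELECT role FROM users WHERE name = ?"
    print(conn.execute(safe_query, (user_input,)).fetchall())  # returns nothing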

How GitHub Copilot works with other security measures

  • Proxies for filtering, e.g., PII: Outbound requests contain a prompt, which is made up of code in the currently edited file and related files. If this request is dropped, Copilot will fail to provide a completion and may show an error message. If the request is modified by a proxy filter that removes personal information or questionable content or code, Copilot can process the request as normal (a simplified redaction rule is sketched after this list).

  • Air-gapped environments: GitHub Copilot for Business requires an active internet connection between a user's IDE and the GitHub Copilot Proxy service. As a result, it does not work in air-gapped environments.
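As a rough sketch of the proxy-filtering scenario above, the following Python snippet scrubs email-address-like strings from a JSON request body before it would be forwarded. The body structure and the single regex rule are simplifying assumptions; this is not part of GitHub Copilot, and a real filter would be far more thorough.

    # A deliberately simple stand-in for a PII-scrubbing proxy rule.
    # The JSON body shape and the regex are assumptions for illustration only.
    import json
    import re

    EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def scrub_prompt(body: bytes) -> bytes:
        """Replace email-like strings in a JSON request body before forwarding it."""
        payload = json.loads(body)
        payload["prompt"] = EMAIL_PATTERN.sub("[REDACTED]", payload.get("prompt", ""))
        return json.dumps(payload).encode()

    original = json.dumps(
        {"prompt": "# contact: jane.doe@example.com\ndef send_report():"}
    ).encode()
    print(scrub_prompt(original).decode())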

Limitations of GitHub Copilot

While our experiments have shown that GitHub Copilot suggests code of the same or better quality than the average developer, we can't give any assurance that the code is bug-free. Like any programmer, Copilot may sometimes suggest insecure code. We recommend taking the same precautions you take with code written by your engineers (linting, code scanning, IP scanning, etc.).


Privacy

GitHub Copilot and privacy

Your privacy is paramount. We're committed to handling your data responsibly, while delivering an optimal GitHub Copilot experience.

What personal data does GitHub Copilot process?

GitHub Copilot processes personal data based on how Copilot is accessed and used: whether via github.com, the mobile app, extensions, or one of various IDE extensions, and through features like suggestions for the command line interface (CLI), IDE code completions, or personalized chat on GitHub.com. The types of personal data processed may include:

  • User Engagement Data: This includes pseudonymous identifiers captured from user interactions with Copilot, such as accepted or dismissed completions, error messages, system logs, and product usage metrics.

  • Prompts: These are inputs for chat or code, along with context, sent to Copilot's AI to generate suggestions. 

  • Suggestions: These are the AI-generated code lines or chat responses provided to users based on their prompts. 

  • Feedback Data: This comprises real-time user feedback, including reactions (e.g., thumbs up/down) and optional comments, along with feedback from support tickets. 

How does GitHub use the Copilot data?

How GitHub uses Copilot data depends on how the user accesses Copilot and for what purpose. Users can access GitHub Copilot through the web, extensions, mobile apps, computer terminal, and various IDEs (Integrated Development Environments). GitHub generally uses personal data to:

  • Deliver, maintain, and update the services as per the customer's configuration and usage, to ensure personalized experiences and recommendations

  • Troubleshoot, which involves preventing, detecting, resolving, and mitigating issues, including security incidents and product-related problems, by fixing software bugs and keeping the online services functional and up to date

  • Enhance user productivity, reliability, effectiveness, quality, privacy, accessibility, and security by keeping the service current and operational

These practices are outlined in GitHub’s Data Protection Agreement (DPA), which details our data handling commitments to our data controller customers. 

GitHub also uses certain personal data with customer authorization under the DPA, for the following purposes:

  • Billing and account management

  • To comply with and resolve legal obligations 

  • For abuse detection, prevention, and protection, virus scanning, and scanning to detect violations of terms of service

  • To generate summary reports for calculating employee commissions and partner incentives

  • To produce aggregated reports for internal use and strategic planning, covering areas like forecasting, revenue analysis, capacity planning, and product strategy

For details on GitHub's data processing activities as a controller, particularly for Copilot Individual customers, refer to the GitHub Privacy Statement.

What is GitHub’s Processing Role for Copilot Business and Enterprise data (controller or processor)?

Data Processor

GitHub acts primarily as a data processor in providing the Copilot Business and Enterprise services. In that capacity, GitHub uses personal data on behalf of our customers (the data controllers):

  • To deliver, maintain, and update the services as per the customer's configuration and usage, to ensure personalized experiences and recommendations

  • To troubleshoot, which involves preventing, detecting, resolving, and mitigating issues, including security incidents and product-related problems, by fixing software bugs and keeping the online services functional and up to date

  • To enhance user productivity, reliability, effectiveness, quality, privacy, accessibility, and security by keeping the service current and operational

GitHub’s data handling commitments for Copilot Business and Copilot Enterprise are in GitHub’s Data Protection Agreement (DPA). 

Data Controller

After authorization through a DPA, GitHub may also process some personal data as a data controller. This is a complete list of those purposes: 

  • For billing and account management

  • To generate summary reports for calculating employee commissions and partner incentives

  • To comply with and resolve legal obligations 

  • For abuse detection, prevention, and protection, virus scanning, and scanning to detect violations of terms of service

  • To produce aggregated reports for internal use and strategic planning, covering areas like forecasting, revenue analysis, capacity planning, and product strategy

The precise data involved depends on the access method and purpose. Users can access GitHub Copilot through the web, extensions, mobile apps, various IDEs (Integrated Development Environments), and features like CLI (Command Line Interface) chat, IDE code completions, or personalized chat on GitHub.com. For more information on GitHub's processing as a data controller (e.g., Copilot Individual customers), see the GitHub Privacy Statement.

Does GitHub Copilot support compliance with the GDPR and other data protection laws?

Yes. GitHub and customers can enter into a Data Protection Agreement that supports compliance with the GDPR and similar legislation.

Does GitHub use Copilot Business or Enterprise data to train GitHub’s model?

No. GitHub uses neither Copilot Business nor Enterprise data to train its models. 

How long does GitHub retain Copilot data for Business and Enterprise customers?

Whether and for how long GitHub retains Copilot data depends on how a Copilot user accesses Copilot and for what purpose. The default settings for Copilot Business and Enterprise customers are as follows:

Access through IDE for Chat and Code Completions:

  • Prompts and Suggestions: Not retained.

  • User Engagement Data: Kept for two years.

  • Feedback Data: Stored for as long as needed for its intended purpose.

All other GitHub Copilot access and use:

  • Prompts and Suggestions: Retained for 28 days.

  • User Engagement Data: Kept for two years.

  • Feedback Data: Stored for as long as needed for its intended purpose.

Does GitHub Copilot use automated decision-making in the meaning of GDPR Article 22?

No. GitHub Copilot does not subject individuals to significant decisions based only on automated processing.

Does GitHub Copilot use third-party subprocessors?

Yes. GitHub shares data with third parties acting as our subprocessors (as defined in the GDPR) to support operations. Any subprocessors to which GitHub transfers personal data will have entered into written agreements with GitHub that are no less protective than the GitHub Data Protection Agreement. All third-party subprocessors are listed at https://docs.github.com/en/site-policy/privacy-policies/github-subprocessors

GitHub Copilot and data flow

How is the data flowing and what is being done with it?

IP and Open Source

GitHub Copilot and copyright

Respecting intellectual property rights is an important part of the software development process. Learn about code ownership, filtering, and public code use here.

Does GitHub Copilot “copy/paste”?

No, GitHub Copilot generates suggestions using probabilistic determination.

  • When thinking about intellectual property and open source issues, it is critical to understand how GitHub Copilot really works. The AI models that create Copilot’s suggestions may be trained on public code, but do not contain any code. When they generate a suggestion, they are not “copying and pasting” from any codebase.  

  • To generate a code suggestion, the Copilot extension begins by examining the code in your editor, focusing on the lines just before and after your cursor, but also drawing on information such as other files open in your editor and the URLs of repositories or file paths to identify relevant context. That information is sent to Copilot's model to make a probabilistic determination of what is likely to come next and generate suggestions (a conceptual sketch of this context assembly follows this list).

  • To generate a suggestion for chat in the code editor, the Copilot extension creates a contextual prompt by combining your prompt with additional context including the code file open in your active document, your code selection, and general workspace information, such as frameworks, languages, and dependencies. That information is sent to Copilot’s model, to make a probabilistic determination of what is likely to come next and generate suggestions.

  • To generate a suggestion for chat on GitHub.com, such as providing an answer to a question from your chat prompt, Copilot creates a contextual prompt by combining your prompt with additional context including previous prompts, the open pages on GitHub.com as well as retrieved context from your codebase or Bing search. That information is sent to Copilot’s model, to make a probabilistic determination of what is likely to come next and generate suggestions. 
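The sketch below illustrates, in Python, the kind of context assembly the code-completion flow above describes: lines around the cursor plus snippets from neighboring files and a repository URL, flattened into a single prompt. The field names and the assembled format are assumptions made for illustration, not Copilot's actual prompt or wire format.

    # A conceptual sketch of assembling completion context, as described above.
    # The dataclass fields and the assembled text are illustrative assumptions,
    # not GitHub Copilot's actual prompt or wire format.
    from dataclasses import dataclass, field

    @dataclass
    class CompletionContext:
        before_cursor: str                  # lines just before the cursor
        after_cursor: str                   # lines just after the cursor
        neighboring_files: dict = field(default_factory=dict)  # path -> snippet
        repo_url: str = ""                  # helps identify relevant context

    def build_prompt(ctx: CompletionContext) -> str:
        """Flatten editor context into a single text prompt for the model."""
        parts = [f"# repository: {ctx.repo_url}"] if ctx.repo_url else []
        for path, snippet in ctx.neighboring_files.items():
            parts.append(f"# from {path}\n{snippet}")
        parts.append(ctx.before_cursor)
        if ctx.after_cursor:
            parts.append(f"# code after cursor:\n{ctx.after_cursor}")
        return "\n\n".join(parts)

    ctx = CompletionContext(
        before_cursor="def load_config(path):\n    ",
        after_cursor="\n    return config",
        neighboring_files={"settings.py": "DEFAULT_PATH = 'config.yaml'"},
        repo_url="https://github.com/example/project",
    )
    print(build_prompt(ctx))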

What are the intellectual property considerations when using GitHub Copilot?

The primary IP considerations for GitHub Copilot relate to copyright. The model that powers Copilot is trained on a broad collection of publicly accessible code, which may include copyrighted code, and Copilot’s suggestions (in rare instances) may resemble the code its model was trained on. Here’s some basic information you should know about these considerations:

  • Copyright law permits the use of copyrighted works to train AI models: Countries around the world have provisions in their copyright laws that enable machines to learn from, understand, and extract patterns and facts from copyrighted materials, including software code. For example, the European Union, Japan, and Singapore have express provisions permitting machine learning to develop AI models. Other countries, including Canada, India, and the United States, also permit such training under their fair use/fair dealing provisions. GitHub Copilot's AI model was trained with the use of code from GitHub's public repositories, which are publicly accessible and within the scope of permissible copyright use.

  • What about copyright risk in suggestions? In rare instances (less than 1% based on GitHub's research), suggestions from GitHub Copilot may match examples of code used to train GitHub's AI model. Again, Copilot does not "look up" or "copy and paste" code; instead, it uses context from a user's workspace to synthesize and generate a suggestion. Our experience shows that matching suggestions are most likely to occur in two situations: (i) when there is little or no context in the code editor for Copilot's model to synthesize, or (ii) when a matching suggestion represents a common approach or method. If a code suggestion matches existing code, there is a risk that using that suggestion could trigger claims of copyright infringement, which would depend on the amount and nature of code used, and the context of how the code is used. In many ways, this is the same risk that arises when using any code that a developer does not originate, such as copying code from an online source or reusing code from a library. That is why responsible organizations and developers recommend that users employ code scanning policies to identify and evaluate potential matching code.

What about open source license considerations?

When a developer uses code made available under an open source software license, they may have to meet license requirements, such as attributing the author of the code, disclosing source code that makes use of open source code, or distributing the code under certain licenses.  If these requirements are not met, the owner of the code could assert claims including copyright infringement or breach of the applicable open source license. 

  • Does a suggestion that matches code automatically trigger copyright or open source considerations? No. The existence of matching code does not itself dictate whether the concerns and their legal risk exist. Whether and when these considerations may apply depends on many factors, including the quantity and nature of the open source code used, and the specific open source license applicable to such code. As with any code that your developers did not originate, the decision about when, how much, and in what context to use any code is one your organization needs to make based on its policies, and in consultation with industry and legal service providers. All organizations should maintain appropriate policies and procedures to ensure that these licensing concerns are properly addressed, as described below. 

Discussing all possible concerns and safeguards around open source is beyond the scope of this document. If your organization is using GitHub Copilot, however, you are likely already developing code, policies, and procedures around open source. You should apply them equally to code suggested by Copilot.

  • Each organization is responsible for setting its open source policies and procedures. 

Does GitHub Copilot include a filtering mechanism to mitigate risk?

Yes, GitHub Copilot does include an optional code referencing filter to detect and suppress certain suggestions that match public code on GitHub.

  • GitHub has created a duplication detection filter to detect and suppress suggestions that contain code segments over a certain length that match public code on GitHub. This filter can be enabled by the administrator for your enterprise and can apply to all organizations within your enterprise, or the administrator can defer control to individual organizations.

  • With the filter enabled, Copilot checks code suggestions for matches or near matches against public code on GitHub of 65 lexemes or more (on average, 150 characters). If there is a match, the suggestion will not be shown to the user.
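As a conceptual stand-in for the matching check just described, the Python sketch below splits a suggestion into rough lexemes and suppresses it if any 65-lexeme window appears in an index of public code. The tokenizer, the index structure, and the exact comparison are illustrative assumptions; GitHub's actual filter is more sophisticated.

    # A conceptual stand-in for the duplication-detection check described above.
    # The tokenizer, index, and comparison are simplifying assumptions, not
    # GitHub's actual implementation.
    import re

    LEXEME_THRESHOLD = 65  # suppress when a window of this many lexemes matches public code

    def lexemes(code: str) -> list:
        """Very rough lexeme split: identifiers, numbers, and individual symbols."""
        return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)

    def should_suppress(suggestion: str, public_index: set) -> bool:
        """Return True if any 65-lexeme window of the suggestion matches indexed public code."""
        toks = lexemes(suggestion)
        windows = (
            tuple(toks[i:i + LEXEME_THRESHOLD])
            for i in range(len(toks) - LEXEME_THRESHOLD + 1)
        )
        return any(window in public_index for window in windows)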

In addition to off-topic, harmful, and offensive output filters, GitHub Copilot also scans the outputs for vulnerable code.

Does GitHub Copilot include features to make it easier for users to identify potentially relevant open source licenses for matching suggestions?

Yes. GitHub Copilot offers a code-referencing feature, currently in preview, to help users find and review potentially relevant open source licenses.

  • If a suggestion matches publicly available code on GitHub, follow your organization's open source policies and procedures. A prudent and responsible step is to investigate the available information before deciding whether to use the suggestion.

  • GitHub Copilot’s code referencing feature identifies suggestions that contain exact matches or near matches to public code. When a match is located, Copilot provides an alert that includes links to repositories for any such matching code, along with any available information on applicable software licenses, and logs this information. Copilot users can review this information to determine whether the applicable suggestions are suitable for use, and whether additional measures may be necessary to use them.

  • Copilot users can also use this feature as a tool for learning. Using the information provided by the code referencing feature, a developer might find inspiration from other codebases, discover documentation, and gain confidence about whether the fragment is appropriate to use in their project. They might take a dependency, provide attribution where appropriate, or possibly even pursue another implementation strategy. By helping developers understand the community context of their code in a manner that also preserves developer flow, we believe Copilot will continue to deliver responsible innovation and true happiness at the keyboard.

Is GitHub Copilot intended to fully automate code generation and replace developers?

No. Copilot is a tool intended to make developers more efficient. It’s not intended to replace developers, who should continue to apply the same sorts of safeguards and diligence they would apply with regard to any third-party code of unknown origin.

  • The product is called “Copilot” not “Autopilot” and it’s not intended to generate suggestions without oversight. You should use exactly the same sorts of safeguards and diligence with Copilot’s suggestions as you would use with any third-party code.

  • Identifying best practices for use of third-party code is beyond the scope of this section. That said, whatever practices your organization currently uses – rigorous functionality testing, code scanning, security testing, etc. – you should continue these policies with Copilot's suggestions. Moreover, you should make sure your code editor or IDE does not automatically compile or run generated code before you review it.

Can GitHub Copilot users simply use suggestions without concern?

Not necessarily. GitHub Copilot users should align their use of Copilot with their respective risk tolerances. 

  • As noted above, GitHub Copilot is not intended to replace developers, or their individual skill and judgment, and is not intended to fully automate the process of code development. The same risks that apply to the use of any third-party code apply to the use of Copilot’s suggestions. 

  • Depending on your particular use case, you should consider implementing the protections discussed above. It is your responsibility to assess what is appropriate for the situation and implement appropriate safeguards. 

You're entitled to IP indemnification from GitHub for unmodified suggestions when Copilot's filtering is enabled. If you elect to enable this feature, the copyright responsibility is ours, not our customers'. As part of our ongoing commitment to responsible AI, GitHub and Microsoft extend our IP indemnity and protection support to customers who are empowering their teams with GitHub Copilot. Details here.

How does GitHub Copilot use your code to provide suggestions?

GitHub Copilot provides suggestions based on the context of what you’re working on in your code editor. This requires temporarily transferring an ephemeral copy of various elements of that context to GitHub’s servers.

  • Generative AI tools provide responses to something generically called a “prompt.” In the case of GitHub Copilot, the prompt consists of various elements from your code editor. This may include file content both in the file you’re editing, as well as neighboring or related files within a project. It may also include URLs of repositories or file paths to identify relevant context. The comments and code, along with this context, are then used to synthesize and suggest individual lines of code and entire functions. 

  • The prompt needs to be transferred to GitHub’s servers for processing. The transmitted data is encrypted, both in transit and at rest. 

  • Copilot interacts with its model, which is hosted on Microsoft’s Azure service, to generate suggestions. These suggestions are then transmitted back to the user. As above, the suggestions are encrypted in transit and at rest. 

  • Prompts are transmitted in real time only to return suggestions. If you are using the Copilot extension in the code editor, your prompt, suggestion, and supporting context will be discarded. If you are using Copilot outside the code editor, your prompt, suggestion, and supporting context will be stored for 28 days.

Does GitHub Copilot retain any of your prompts that it used as a basis for providing suggestions?

The GitHub Copilot extension in the code editor does not retain your prompts for any purpose after it has provided suggestions, unless you are a Copilot Individual subscriber and have allowed GitHub to retain your prompts and suggestions.

  • As noted above, Copilot does transfer content from your code editor to GitHub’s servers for purposes of assessing the context and providing suggestions. The transferred copy is purely ephemeral and, shortly after Copilot has provided suggestions, the copy is deleted. It is not used for any other purpose. 

GitHub Copilot offerings outside of the code editor extension, such as in the CLI, do retain your prompts and suggestions in order to provide the service. For more information please review the Privacy section.

Does GitHub Copilot use any of your code to train GitHub's model (or any successor model)?

No. GitHub uses neither Copilot Business nor Enterprise data to train the GitHub model.

Who owns the suggestions provided by GitHub Copilot?

We don’t determine whether a suggestion is capable of being owned, but we are clear that GitHub does not claim ownership of a suggestion.

  • Whether a suggestion generated by an AI model can be owned depends on many factors (e.g., the intellectual property law in the relevant country, the length of the suggestion, the extent to which the suggestion is considered 'functional' rather than expressive, etc.).

  • If a suggestion is capable of being owned, our terms are clear: GitHub does not claim ownership.

  • In certain cases, it is possible for Copilot to produce similar suggestions to different users. For example, two unrelated users both starting new files to code the quicksort algorithm in Java will likely get the same suggestion. The possibility of providing similar suggestions to multiple users is a common part of generative AI systems. Regardless, GitHub does not claim ownership.

Labor Market

GitHub Copilot and the future of work

GitHub Copilot isn’t made to replace developers—it’s here to enhance their work and make the industry more inclusive, too.

Does GitHub Copilot empower developers and enhance productivity?

Bringing in more intelligent systems has the potential to bring enormous change to the developer experience. We do not expect GitHub Copilot to replace developers. Rather, we expect GitHub Copilot to partner with developers, augment their capabilities, and enable them to be more productive, reduce manual tasks, and help them focus on interesting work. 

How does GitHub Copilot act as a catalyst for inclusive development?

Early research on generative AI, and GitHub Copilot specifically, finds that these tools have the potential to lower barriers to entry and enable more people to become software developers. GitHub supports programs to expand access to Copilot (including making it free for students) and other developer tools in an effort to help those interested in joining the industry.

How does GitHub Copilot transform developer opportunities?

Advances in developer productivity are nothing new. While AI may change your workflow, history offers many examples of how jobs evolve and adapt, often creating more opportunities in the process. Compilers, high-level programming languages, open source software, IDEs—the list of advances that have changed how developers work is long and ever-expanding. The data shows that, over time, these tools have lowered costs of software development while dramatically increasing demand for software and developers. Here’s the result: According to the US Bureau of Labor Statistics, there are more developers than ever before and they are paid more, too (even after adjusting for inflation).

Accessibility

GitHub Copilot and accessibility

GitHub is committed to empowering developers with disabilities to help build the technologies that drive human progress.

What is GitHub's mission and goal regarding developer collaboration and accessibility for people with disabilities?

At GitHub, our mission is to accelerate human progress through developer collaboration. We believe people with disabilities should benefit from and be able to contribute to the creation of that progress. 

Our goal is to empower developers with disabilities to build on GitHub. In doing so, we collectively increase access to technology for all people with disabilities. This includes access to our AI pair programmer, GitHub Copilot, which improves developer productivity and happiness.

What accessibility standards does GitHub use?

While developing products including GitHub Copilot, we take into account leading global accessibility standards, which include:

  • Web Content Accessibility Guidelines (WCAG)

  • U.S. Section 508 

  • EN 301 549

How does GitHub include accessibility in the development process?

In addition to accessibility standards, the development and iterative improvement of our products is guided by the lived experiences of people in our community with disabilities. We host regular internal accessibility office hours that provide direct feedback to designers and developers.

Accessibility is also integrated into our development processes through design checklists, linting, code inspection, automated accessibility scanning, and manual testing. GitHub Copilot for Visual Studio and Visual Studio Code leverages the native accessibility features of those Integrated Development Environments.

How does GitHub test accessibility?

Our internal accessibility audits are performed by testers that have been certified through the U.S. Department of Homeland Security (DHS) Trusted Tester Program. The Trusted Tester program creates a common testing approach, including code and UI inspection-based tests for determining software and website accessibility compliance and conformance to accessibility standards. Our internal audit process also includes testing by people with disabilities.

Where can users find information about GitHub accessibility?

GitHub Copilot and contracts

There are a few documents that govern your use of GitHub Copilot. Learn about them here.

  • GitHub Copilot Product Specific Terms

Constantly improving

We're always working hard to improve our products. For questions about updates or changes, please reach out to a GitHub representative.
