AllAi Security and Privacy - FAQ
Monica Pastor
The following key categories of data are stored and used by the AllAi solution:
Access data: email and name using Auth0
Telemetry data: usage frequency by user
Chat threads: only the ones selected by the users.
Customer documentation, guides and requirements: via the selected Confluence spaces (optional)
Customer code: selected Bitbucket code repositories (optional)
Data is stored in an Atlas database, and vectors in Milvus databases (vectors only, not data/metadata). Both our Atlas and Milvus accounts run on the AWS cloud (in us-east).
We do not use customer data for model training, improvements, or similar purposes. Data is deleted as soon as the client cancels the contract or actively opts to have it deleted.
Allowed users (per the SSO setup), via the applications that are part of AllAi.
Direct access: AllAi Security and Audit Team.
Data about your customer, which resides in Salesforce, remains in Salesforce and is never transferred to or through the LLM.
Metadata, such as code, object structures, and similar elements, is stored.
For data stored in a code repository (GitHub), standard access restrictions apply, meaning only the team members allocated to the project have access to it.
Regarding data storage in a private embedding index (Milvus): by private we mean that the index is accessible solely by the subscriber who created it. The subscriber is protected through Auth0 for both creation and retrieval processes. Retrieval is safeguarded before reaching the LLM.
As for the data processor, the relevant segments of metadata, combined with the Developer's Prompt, are sent directly to one of two OpenAI endpoints (/v1/chat/completions or /v1/completions), both of which adhere to OpenAI's zero-retention policies.
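As a hedged illustration of that call path (not AllAi's actual code; the helper name, model choice, and message layout are assumptions), sending metadata segments plus a developer prompt to the /v1/chat/completions endpoint might look like this in Python:

```python
# Hedged sketch only: how metadata segments plus the Developer's Prompt
# might be sent to the zero-retention /v1/chat/completions endpoint.
# The helper name, model choice, and message layout are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def complete(developer_prompt: str, metadata_segments: list[str]) -> str:
    """Combine retrieved metadata segments with the developer's prompt
    and call the chat completions endpoint."""
    context = "\n\n".join(metadata_segments)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Relevant metadata:\n{context}"},
            {"role": "user", "content": developer_prompt},
        ],
    )
    return response.choices[0].message.content
```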
Customer queries (and the associated outputs) are considered confidential information, and OSF agrees to protect such information under our agreement.
We securely store user requests and responses to provide users with access to their conversation history in certain services, such as Chat. The queries entered by users and the answers provided by AllAi are not used by OSF to further train the AI, except for the customer's benefit when agreed upon, nor will they be shared or become public information.
Note that if the user utilizes the code indexing feature (available in the Pro and Enterprise plan), we host the indexed data. The servers hosting this data can only be accessed through the application's IPs within a restricted Virtual Private Cloud (VPC). Even for this specific feature, where we host the indexed data, we do not use it to train the AI, and it is not shared.
"Private" embeddings are secured via Auth0 authentication. We ensure secure user authentication through integration with Auth0, an identity management system.
We employ "ZeroCopy" policies for the LLM itself. As an enterprise customer of OpenAI, our agreement prohibits OpenAI from using input or outputs for any other purposes, including for training their AI. We adhere to the policy outlined in https://openai.com/enterprise-privacy , and we have explicitly opted out of allowing API data to be used for training or improving OpenAI's models.
We have implemented a protected data retrieval system to ensure the verification of private customer data within the workspace. We categorize information based on its sensitivity into two main types: public data, such as Salesforce documentation and public GitHub repositories, and other data categories. Public data sources are accessible to everyone, whereas company-specific data is private and limited to that company's specific database. When a customer decides to leave, we delete their database after a period defined in the terms of use.
We use encryption techniques to secure data and make it unreadable to unauthorized users. Our database is encrypted using industry standards (we use Atlas for database management), and prompts are protected in transit with TLS encryption over HTTPS.
Additionally, we offer a toxicity detection system that can be tailored to specific use cases; by default, we utilize OpenAI's abuse monitoring system.
None of the use cases we currently support process PII. If there is PII in the sources we are indexing, we can add a "clean-up" step during our indexing process.
This is an extra protection step we take, as no data is shared with other organizations or used for retraining.
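A minimal sketch of what such a clean-up step could look like, assuming a Python indexing pipeline; the patterns below are illustrative and deliberately simple, not an exhaustive scrubber:

```python
# Hypothetical clean-up step for the indexing pipeline. The PII patterns
# are illustrative assumptions, not AllAi's actual rules.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub_pii(chunk: str) -> str:
    """Replace recognized PII patterns with typed placeholders
    before the chunk is embedded and stored."""
    for label, pattern in PII_PATTERNS.items():
        chunk = pattern.sub(f"[REDACTED_{label}]", chunk)
    return chunk
```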
Milvus only contains embeddings (a numerical projection of the text into a vector space). Embeddings contain semantic information, but not enough to recreate the original text.
Atlas (https://www.mongodb.com/atlas/database) contains the human-readable chunks. There are three levels of data granularity involved:
Documents (human readable full content)
Chunks (human-readable, fractioned and cleansed content): each chunk contains a lesser copy of the original
Embeddings (numerical vectors in a high-dimensional space; one cannot work back to the original document from them)
There are 4 main products involved:
Parsers (AWS ECS hosted back-ends)
Vector Database (Milvus)
Database (Atlas)
LLM (OpenAI)
The complete process is as follows:
A. (data processing) (AWS) First, the code and documents are broken down into chunks (paragraphs and functions) that are still human readable. This happens in a backend hosted in AWS Elastic Container Service. (Steps A-C are sketched in code after this list.)
B. (data processing) (OpenAI) The chunks are then sent to OpenAI's embeddings API, which returns the embeddings.
C.1. (data storage) (Atlas) The chunks themselves are then stored in Atlas; a different database is created for each client, but all Seats from that client share the same database.
C.2. (data storage) (Milvus) The embeddings themselves are then stored in Milvus; a single cluster is used for all clients, in which a different collection is created per client/seat.
D. (data processing) (Milvus) Upon user interaction with AllAi, the user input (prompt) is sent through an AWS ECS backend (protected by an Auth0 access token) to Milvus for vector search. Milvus answers with chunk ids. (Steps D-F are sketched after this list.)
E. (data processing) (Atlas) Upon receiving the chunk ids, the AWS ECS backend retrieves the chunks from Atlas and sends them back to the end user (AllAi Code extension, AllAi Chrome Extension) or to an AWS ECS backend (AllAi Chat).
F. (data processing) (OpenAI) The end user then sends a new user input (prompt) composed of those chunks to the LLM. The LLM generates the answer.
G. (data storage) (Atlas) Measures (token count, usage frequency) are stored in the Atlas database for quality and system control.
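To make steps A-C concrete, here is a minimal sketch of the indexing side of the pipeline, assuming Python backends; the connection URIs, database/collection naming, and helper names are illustrative assumptions, not AllAi's actual implementation:

```python
# Minimal sketch of steps A-C: chunk, embed, store. All names, URIs, and
# naming conventions are illustrative assumptions.
import re
from openai import OpenAI
from pymongo import MongoClient
from pymilvus import MilvusClient

openai_client = OpenAI()                               # reads OPENAI_API_KEY
atlas = MongoClient("mongodb+srv://...")               # placeholder URI
milvus = MilvusClient(uri="https://...", token="...")  # placeholder

def chunk_text(text: str) -> list[str]:
    """Step A: split a document into human-readable paragraph chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def index_document(client_id: str, doc_id: str, text: str) -> None:
    chunks = chunk_text(text)
    # Step B: OpenAI's embeddings API returns one vector per chunk
    vectors = openai_client.embeddings.create(
        model="text-embedding-3-small", input=chunks
    )
    for i, (chunk, item) in enumerate(zip(chunks, vectors.data)):
        chunk_id = f"{doc_id}:{i}"
        # Step C.1: human-readable chunks go to the client's own Atlas database
        atlas[f"client_{client_id}"]["chunks"].insert_one(
            {"_id": chunk_id, "text": chunk}
        )
        # Step C.2: embeddings go to a per-client collection in a shared Milvus cluster
        milvus.insert(
            collection_name=f"client_{client_id}",
            data=[{"id": chunk_id, "vector": item.embedding}],
        )
```

And a matching sketch of the retrieval-and-generation side (steps D-F), reusing the clients defined above; again, the search parameters and prompt layout are assumptions:

```python
def retrieve_and_generate(client_id: str, prompt: str, top_k: int = 5) -> str:
    # Step D: embed the prompt and ask Milvus for the nearest chunk ids
    query_vector = openai_client.embeddings.create(
        model="text-embedding-3-small", input=[prompt]
    ).data[0].embedding
    hits = milvus.search(
        collection_name=f"client_{client_id}",
        data=[query_vector],
        limit=top_k,
    )
    chunk_ids = [hit["id"] for hit in hits[0]]  # Milvus answers chunk ids
    # Step E: fetch the human-readable chunks from Atlas
    docs = atlas[f"client_{client_id}"]["chunks"].find({"_id": {"$in": chunk_ids}})
    context = "\n\n".join(doc["text"] for doc in docs)
    # Step F: the chunks and the prompt go to the LLM, which generates the answer
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Context:\n{context}"},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content
```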
As for the specific Zero Copy (or Zero Retention) guarantee, this is achieved through our choice of API usage, as listed on https://platform.openai.com/docs/models/default-usage-policies-by-endpoint. Part of OpenAI's offering stores information, part of it does not; we have built AllAi to rely on the APIs that support the Zero Copy policies, namely /v1/chat/completions and /v1/embeddings.
Following from the description above, for "Private Knowledge" there is the notion of Orgs and Seats:
A) The chunks are stored in Atlas (in the database created for the Org).
B) The embeddings are stored in Milvus.
C) Access to both is gated through an AWS ECS hosted backend that relies on an Auth0 access token.
D) The platform currently supports two levels of access for this "Private Knowledge" (see the sketch after this list):
D.1) User level: only the Seat who created it has access to it; user-level access is enforced by the AWS ECS backend.
D.2) Org Level: All Seats in the Org have access to it.
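A minimal sketch of how such a check could be enforced in the backend, assuming illustrative knowledge-base record fields and Auth0 claim names (none of these are AllAi's actual schema):

```python
# Hypothetical access check for "Private Knowledge". The record fields
# (access_level, owner_seat_id, org_id) and claim names are assumptions.
def can_access(kb: dict, claims: dict) -> bool:
    """D.1: user-level KBs are visible only to the creating Seat.
    D.2: org-level KBs are visible to every Seat in the Org."""
    if kb["access_level"] == "user":
        return kb["owner_seat_id"] == claims["sub"]
    if kb["access_level"] == "org":
        return kb["org_id"] == claims.get("org_id")
    return False  # deny by default
```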
OSF has implemented compliance controls so that our teams meet their compliance obligations and adhere to industry regulations, depending on the nature of the product and the data it handles (PO-OSF-09_A4_Security Statements for OSF IT Systems processing PII.docx).
We have abbreviated technical and privacy documentation for our customers, detailing our compliance with GDPR, and guides to help enable secure and compliant use of our products and services. Company emails are stored by OSF middleware during the contract period to collect behavioral analytics. After termination, these emails can be deleted at the customer's request.
The same guidelines that typically apply to maintaining intellectual property in a code repository (GitHub) are enforced. This ranges from code isolation between teams to best practices for not storing credentials and secrets within the code.
There is a human in the loop when code is produced, which reduces the risk of unsupervised malicious code generation. The standard delivery process includes a code review by a senior developer, thereby reducing the risk of unintentionally deploying malicious code.
We also maintain internal policies to assess outputs for accuracy, biases, appropriateness, and usefulness before relying on and utilizing them. (MC-OSF-03_P08_AI Code of Conduct.docx)
The codebase is not stored in the shared LLM; rather, it is kept in a "private" knowledge base. Access to these is restricted to the team assigned to the project. The retrieval of the codebase occurs before it reaches the LLM for generation. We have some default data sources (Salesforce Docs, B2C Commerce, etc.) that are shared with everyone, but custom data that is specific to a company will remain private and scoped exclusively to that company.
Additionally, only a single "Private" knowledge base can be active at any given time, which prevents developers assigned to multiple projects from unintentionally merging information during generation.
Queries that users enter and the responses provided by AllAi are not being used by OSF to further train the AI, nor will they be shared or become public information. There are some initiatives, agreed upon with the customer, that use their code to improve AllAi's specialization, but this is done through a back-channel agreement and is not associated with the AllAi productivity enhancements.
The main and most effective safeguard remains the human in the loop and the traditional deployment processes requiring code review. A malicious attempt at modifying or injecting into the generated code base would require bypassing the vigilance of the developer (human in the loop) and the code review processes.
Malicious Actors with access to the LLM still don’t have access to the Codebase as it is stored in a “private” knowledge base.
A malicious attempt at reverse-engineering the codebase would require access to Milvus, which is shielded by single-user authentication.
There is also per-client authentication, whereby RAG retrieval operations are restricted to the content indexed by the user, preventing information leakage. Retrieval happens before the LLM is ever reached, making it less subject to typical injections.
For protection against prompt injection, we primarily rely on OpenAI's security measures. Nevertheless, it is important to acknowledge that there is no method to completely safeguard against potential prompt injections in certain services, such as Chat. Non-chat applications such as inline code completion and DevOps are significantly less susceptible to these risks, because users do not have the capability to iteratively prompt the system, which effectively reduces the likelihood of most prompt attack strategies.
Our development process incorporates a suite of specialized linters tailored to each programming language to ensure code quality and consistency.
Furthermore, we leverage SonarQube and SonarCloud to conduct comprehensive scans across the codebase, enhancing our toolset's capabilities. Every pull request (PR) undergoes a meticulous review by our seasoned Solution Architects (SAs) and is subject to a thorough evaluation by our Quality Assurance (QA) team.
We safeguard our supply chain by regularly checking all dependencies for security issues using OWASP, and ensuring our direct dependencies comply with open source licenses through FOSSA.
Complementing these measures, we deploy our advanced AI-powered bug detection system, which not only identifies and rectifies issues proactively but also provides valuable insights to streamline the PR review process.
Only our key services can be reached through APIs. They are secured using HTTPS with TLS (Transport Layer Security) for encryption in transit. Authentication is handled by API keys, OAuth tokens, or JWTs (JSON Web Tokens), with sign-in via OAuth 2.0, SAML, or OpenID Connect. Data is encrypted in transit and at rest, ensuring it is always protected. Internal services use local, VPC-protected data transfer.
AllAi Code: source code (stored, but it can be configured to be customer hosted)
Jira: issue data, and comments (processed, but not stored).
Chat: Message content, file attachments.
DevOps: code, comments, and commit messages (processed, but not stored).
Data is secured using encryption. Upon contract termination, data can be automatically deleted (multi-tenancy customer data protection). Procedures for data deletion should comply with GDPR and other data protection regulations.
We store only the data required to power such a feature: raw content and indexed content. Indexing data resides within a secure database or a dedicated search index, both protected similarly.
Enterprise versions often provide enhanced security features, such as dedicated hosting environments, advanced encryption options, and more granular access controls. Data handling policies in the Enterprise version may be more customizable to meet specific business needs.
It will be configurable, and the frequency will be set considering the use case and cost. Example: daily updates by default; where critical business cases are supported, updates can run every minute (almost live).
SOC 2 certification is currently in progress but is not yet available.
The main use case is RAG (retrieval-augmented generation).
The user input is augmented and transformed to perform better as a query against the retrieval systems. The query is processed lexically and semantically for document retrieval. The documents are combined and summarized as part of the generation process. The generations are then compared using an additional LLM-as-judge process for measurement and evaluation purposes.
The generated output is then presented to the end user.
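As a hedged sketch of the augmentation step only (the rewrite instruction and model choice are illustrative assumptions, not AllAi's actual prompt):

```python
# Hypothetical query-augmentation step: rewriting raw user input into a
# retrieval-friendly query before lexical and semantic search.
from openai import OpenAI

client = OpenAI()

def augment_query(user_input: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Rewrite the user's request as a concise, keyword-rich "
                        "search query. Return only the query."},
            {"role": "user", "content": user_input},
        ],
    )
    return response.choices[0].message.content.strip()
```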
User-Owned "Permanant Threads" contains each step involved in the RAG (retrieval augmented generation) process. Including the augmented search query, the retrieve documents, the considered document, the various similarity and relevance scoring. Part of those information are presented through UI to the end user to allow Human-In-The-Loop assessment of the answer, part of those information are kept hidden for troubleshooting.
This data is attached to the thread itself and is deleted upon thread deletion requests.
Models used: OpenAI gpt-4o, o3-mini, text-embedding-3-small (these are proprietary OpenAI models, not open source).
We provide reporting capabilities that enable administrators to monitor product usage and assess the quality of the inputs.
We ensure that the AI system is updated and maintained through continuous development. Various use cases and domain Expertise are periodically scored against a set of private evaluation datasets.
Continuous development includes continuous UAT phases, security scans and reports, and methodology updates.
We reevaluate the system at each model release, and we create additional evaluation datasets as the Expertise evolves and new risks emerge.
Our updates are aligned with the release cycle of OpenAI models, ensuring that we stay current with the technology. This approach allows us to integrate new features, improve performance, and enhance safety measures as they become available.
All our services are capable of auto-scaling. Whenever a service has difficulty handling the volume of requests, we quickly spawn more servers to handle such bursts. It all happens in a matter of a couple of minutes.
Additionally, we have dedicated sets of workers that scale horizontally when necessary to process large amounts of data. The entire pipeline can run in parallel, and we can leverage AWS infrastructure to process everything very quickly.
Our infrastructure can scale to thousands of users in theory, but another limit is OpenAI itself: OpenAI allows only 10k requests per minute. Assuming every user sends 2 chat messages per minute on average, our theoretical ceiling is about 5,000 users using the chat at once (10,000 requests per minute / 2 requests per user per minute).
This limit is shared between all apps.
See MC-OSF-03_P08_AI Code of Conduct.
OSF Digital adheres to several best practices and guidelines for Human-AI system interaction:
Transparency: OSF Digital is committed to being transparent about the use of AI systems, including informing users about the purpose, data usage, and decision-making processes of AI systems.
Privacy: AI systems are designed to comply with relevant data protection laws, ensuring appropriate security measures are in place to prevent unauthorized access to user data.
Fairness: OSF aims to design AI systems to be fair and unbiased, although it acknowledges that complete fairness cannot be guaranteed.
Human Oversight: AI systems are subject to human oversight and control, with humans involved in development and deployment, and able to intervene in decision-making processes.
Ethical Considerations: OSF considers the ethical implications of AI systems to ensure alignment with company values and principles.
Regarding AI-specific policies and guidelines, OSF has established best practices for using AI tools, both internally and in development efforts. These include:
Reviewing AI-generated material for accuracy and potential infringement.
Prohibiting the upload of OSF-owned data to public AI systems to avoid legal risks.
Ensuring AI-generated code undergoes human review and automated checks.
OSF also has AI training courses in our internal LMS (Learning Hub).
Most of the logs center on the AI system itself, recording where and when the features were used. Each extension point, including the AllAi Jira Extension, AllAi GitHub Extension, AllAi VS Code Extension, and AllAi Chrome Extension, collects the version of the hosting app and the user agents for traceability and debugging purposes.
Most applicative logs are retained for up to 90 days.
Each of the AllAi extension points, including the AllAi Jira Extension, AllAi GitHub Extension, AllAi VS Code Extension, and AllAi Chrome Extension, collects its logs in isolation in separate log groups, each accessible only by a single product owner, offering only a partial view of the logs.
The Technical Director role and above also have access to the same.
The data sources exposed to RAG (retrieval-augmented generation) are divided into three types. "Platform"-level information contains only generally available information; it is selected and vetted before indexation by the specialization team.
"Project" level information, which contain metadata, code and documentation pertaining to an Account. The availability of the fields, (field level minimization) are pre-filtered before indexation to minimize the information stored in the data sources, and the availability of the data sources themselves are restricted to their owner.
"Task" level information, which is ephemeral. Is presented to the system in a Zero Retention policy.
User-Owned "Permanant Threads" are self managed by the user and can be deleted at will
User-Owned "Temporary Threads" are also set with an initial expiry of 7 days, automatically deleted at expiry
User-Owned "Ephemeral Threads" exists for the duration of the RAG process and are discarded at completion.
Upon contract termination, data can be automatically deleted in compliance with GDPR and other data protection regulations. The procedures for data deletion are designed to ensure multi-tenancy customer data protection, meaning that each customer's data is handled separately and securely.
There is no automated data anonymization in place. The system is not trained on identifiable information.
We address the risk of bias by relying on OpenAI's robust measures to mitigate bias in their models. OpenAI employs various techniques and processes to ensure that their AI systems do not provide biased responses or discriminate against certain groups of people. While we do not have a formalized bias management process of our own, we trust in OpenAI's commitment to fairness and inclusivity, which is reflected in the design and training of their models.
Various use cases and domain Expertise are periodically scored against a set of private evaluation datasets.
We use RAGAS metrics (Faithfulness, Contextual Recall, Contextual Precision), along with an LLM-as-a-judge custom accuracy measure.
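For illustration, scoring one sample with the ragas library's faithfulness, context recall, and context precision metrics could look like the sketch below; the sample text and the question/answer/contexts/ground_truth dataset layout are assumptions, not AllAi's actual evaluation harness:

```python
# Hedged sketch of scoring one RAG sample with ragas metrics.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, context_recall, context_precision

sample = Dataset.from_dict({
    "question": ["Where are embeddings stored?"],
    "answer": ["Embeddings are stored in a per-client Milvus collection."],
    "contexts": [["Embeddings are stored in Milvus, one collection per client/seat."]],
    "ground_truth": ["Embeddings are stored in per-client Milvus collections."],
})

# evaluate() calls an LLM under the hood (OPENAI_API_KEY must be set)
scores = evaluate(sample, metrics=[faithfulness, context_recall, context_precision])
print(scores)
```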
User-Owned "Permanant Threads" are stored in Atlas, encrypted at rest using AES-256
User-Owned "Ephemeral Threads" exists for the duration of the RAG process and are discarded at completion, they only persist in the Data Processing server's memory.
The User-Owner themselves have access to their own Threads.
The Technical Director role and above have access to the same.
Systems responsible for holding data, such as Atlas and Milvus, are configured with 3-way redundancy; periodic backups are also stored with geo-redundancy.
Systems responsible for processing and interfacing with the API have 2-way redundancy: a primary host and a fallback.