
CodeGenie: How Salesforce Leveraged Generative AI to Enhance Internal Developer Productivity

Shan Appajodu
Sep 10 - 7 min read

By Shan Appajodu and Katie Stasaski.

Salesforce has been a leader in AI technology for over a decade, continuously advancing from predictive AI to generative AI and now to autonomous AI. These developments are set to revolutionize the entire software development lifecycle. As an industry leader, Salesforce has chosen to develop its own technology, creating models specifically trained on our codebase. This approach is designed to support Salesforce-specific use cases and workflows, thereby enhancing the capabilities of our developers.

To further this initiative, Salesforce launched CodeGenie, an internal IDE-based tool aimed at boosting developer productivity. This tool is similar to Einstein for Developers, which provides external Salesforce developers with an IDE extension to aid in writing Apex and LWC code. To better serve internal developers, CodeGenie supports various programming languages and is trained on internal codebases. Insights from internal developers using CodeGenie enhance Einstein for Developers, which in turn benefits external developers, creating a mutually beneficial improvement cycle.

Since launching, CodeGenie has evolved beyond just an IDE tool; it now supports specific workflow use cases at every stage of the software development lifecycle (SDLC). Our ongoing efforts focus on developing agents that are deeply integrated with Salesforce’s SDLC, ultimately empowering internal developers to work more efficiently and effectively.

Enhancing IDE Productivity with CodeGenie’s Advanced Autocomplete Features

CodeGenie initially aimed to improve code completion within the IDE. As developers type, a purpose-built, low-latency AI model, Salesforce AI Research’s CodeGen2.5, provides autocomplete suggestions. The model processes the contents of the current file, along with context from related files, to predict the code a developer is most likely to type next. Because CodeGen2.5 can “infill”, it considers the code both before and after the developer’s cursor and generates code that fits at the exact location of typing.
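To make the infilling idea concrete, here is a minimal sketch of how an editor might split a file at the cursor, arrange the two halves into a single prompt, and splice the model's output back in. The sentinel strings `<MASK>` and `<SEP>` and both function names are hypothetical placeholders; the actual tokens and prompt layout depend on the model's tokenizer and are not described in this article.

```python
# Hypothetical sentinel strings; real ones depend on the model's tokenizer.
MASK = "<MASK>"
SEP = "<SEP>"


def build_infill_prompt(source: str, cursor: int) -> str:
    """Split the file at the cursor and lay out prefix and suffix for infilling."""
    prefix, suffix = source[:cursor], source[cursor:]
    # The model sees code on both sides of the gap, then is asked to
    # generate the text that belongs where the first MASK appears.
    return f"{prefix}{MASK}{suffix}{SEP}{MASK}"


def splice_completion(source: str, cursor: int, completion: str) -> str:
    """Insert the generated text at the cursor position."""
    return source[:cursor] + completion + source[cursor:]


source = "def add(a, b):\n    return \n"
cursor = source.index("return ") + len("return ")
prompt = build_infill_prompt(source, cursor)
```

The point of the layout is that the suffix is part of the prompt, so a suggestion can take into account what already follows the cursor rather than only what precedes it.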

To ensure the suggestions are both accurate and pertinent, CodeGen2.5 is fine-tuned using internal Salesforce repositories. This customization helps in tailoring the model’s responses to the specific coding practices and nuances of Salesforce’s environment.

Additionally, the decision to display code completions is driven by a logistic regression model. This model evaluates various factors, such as the contents of the current file and the developer’s past actions, to determine the likelihood that a user will want to see a completion suggestion. This predictive feature enhances the coding experience by providing assistance precisely when it’s most needed.
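The display decision described above could look roughly like the following sketch. The feature names, weights, bias, and threshold are all invented for illustration; the article does not list the actual features or parameters of the production model.

```python
import math

# Hypothetical features and weights, for illustration only.
WEIGHTS = {
    "recent_acceptance_rate": 2.0,  # share of recent suggestions the user accepted
    "chars_typed_on_line": 0.05,    # how much the user has typed on the current line
    "cursor_at_line_end": 1.0,      # 1.0 if the cursor sits at the end of the line
}
BIAS = -1.5
THRESHOLD = 0.5  # assumed cut-off for showing a suggestion


def show_probability(features: dict[str, float]) -> float:
    """Logistic regression: a weighted sum passed through a sigmoid."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def should_show_completion(features: dict[str, float]) -> bool:
    """Show a suggestion only when the predicted probability clears the threshold."""
    return show_probability(features) >= THRESHOLD


engaged = {"recent_acceptance_rate": 0.6, "chars_typed_on_line": 10, "cursor_at_line_end": 1.0}
idle = {"recent_acceptance_rate": 0.0, "chars_typed_on_line": 0, "cursor_at_line_end": 0.0}
```

A model this small is cheap to evaluate on every keystroke, which is one plausible reason to gate a larger generation model behind it.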

A CodeGenie autocomplete suggestion shown to a user as they type in a Python file.

Enhancing Developer Support with Multi-Turn Chat and Advanced Code Assistance

While inline autocomplete and code generation significantly aid developers in coding, the development process encompasses a variety of other tasks. These include understanding existing code, refactoring, code review, and writing test cases, among others. To address these multifaceted needs, Salesforce has introduced a multi-turn chat interaction feature powered by Salesforce AI Research’s xGen model. The xGen-Code model, which powers CodeGenie chat, was trained on developer use cases and outperforms other open- and closed-source models. It excels in handling complex reasoning, multi-turn responses, and dynamic interactions. This allows developers to engage in dialogues with a chat agent, asking freeform text questions that are specific to their code repository.

The system is equipped with pre-set chat commands that enable developers to perform common tasks more efficiently. For more in-depth queries related to the repository, developers can invoke an agent that utilizes Retrieval-Augmented Generation (RAG). This technology extracts relevant code snippets, documentation, and project structure information, providing comprehensive support that extends beyond simple code generation to encompass a broader scope of development activities. This holistic approach not only streamlines the development process but also enhances the overall productivity and effectiveness of developers.
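The retrieval step can be illustrated with a deliberately simple sketch: rank repository snippets by word overlap with the question, then place the best matches in the prompt. This is not how CodeGenie is implemented; a production system would more likely use learned embeddings and an index, and the function names, prompt layout, and sample files here are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercased bag of identifier-like tokens."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, snippets: dict[str, str], k: int = 2) -> list[str]:
    """Return the paths of the k snippets most similar to the question."""
    q = tokenize(question)
    ranked = sorted(snippets, key=lambda path: cosine(q, tokenize(snippets[path])), reverse=True)
    return ranked[:k]


def build_prompt(question: str, snippets: dict[str, str], k: int = 2) -> str:
    """Prepend the retrieved snippets to the question as context for the chat model."""
    context = "\n\n".join(f"# {path}\n{snippets[path]}" for path in retrieve(question, snippets, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical repository contents.
snippets = {
    "auth/login.py": "def login(user, password): return check_password(user, password)",
    "billing/invoice.py": "def total(invoice): return sum(item.price for item in invoice.items)",
    "README.md": "Project setup and install instructions",
}
question = "How does login check the password?"
```

Calling `build_prompt(question, snippets, k=1)` yields a prompt that contains only the login snippet, which is the grounding a chat model would need to answer a repository-specific question.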

A developer invokes the @codebase CodeGenie agent, which retrieves relevant file snippets from the repository and passes them to the chat model as context.

Meeting Developers Where They Are: Streamlining the SDLC with AI Integration

Developers utilize a variety of tools to accomplish their tasks, and Salesforce aims to enhance their experience by integrating AI capabilities directly into these tools. Common touchpoints such as the GitHub console, Portals, CLI, and Slack are areas where Salesforce has focused its efforts.

One of the key enhancements provided by CodeGenie is around pull requests and code reviews. CodeGenie automates several aspects of this process, including generating accurate pull request titles, linking relevant work items, and summarizing code changes. It also serves as a virtual code reviewer by identifying potential issues, suggesting optimizations, and flagging missing test cases. This automation significantly reduces the manual effort required during code reviews, freeing developers to concentrate on more complex tasks. Additionally, CodeGenie’s capability to generate unit tests ensures high-quality code and improves overall code coverage.

CodeGenie also offers contextual assistance within the Portal/Developer Hub, which is used by developers for high-fidelity activities such as monitoring releases, change propagation, and fleet operations. As the service fleet grows, navigating this information becomes increasingly complex. CodeGenie provides relevant information and suggests appropriate actions within every page of the portal, enhancing usability and efficiency. Similarly, in Slack, CodeGenie can suggest appropriate next steps following every notification a developer receives, ensuring seamless integration and continuous support throughout the development process.

Driving AI Impact: Key Metrics for Meaningful Outcomes

In the realm of AI development, it’s crucial to base enhancements on solid metrics to ensure that each feature effectively achieves the desired outcomes. From the outset, significant effort has been made to collect detailed telemetry to understand how developers interact with the technology. Initial metrics tracked include monthly active users, lines of code accepted, lines of code retained, and the percentage of code generated and retained. These metrics provide a foundational understanding of user engagement and the effectiveness of the AI.

However, it’s equally important to link these metrics to tangible business outcomes such as reduced cycle times, time savings per developer, and improvements in quality and velocity. Before a new model is deployed, an automated benchmark confirms that its performance has improved, or at least not degraded, following a methodology similar to prior research such as Fried et al. 2023. The benchmark measures the exact-match accuracy of model-infilled lines of code against a held-out test set drawn from internal repositories.
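As a rough sketch of the pre-deployment check described above, exact-match accuracy is the share of infilled lines that equal the held-out line. Ignoring leading and trailing whitespace, and the `no_regression` gate itself, are assumptions made for this example; the sample lines are invented.

```python
def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predicted lines that exactly match the held-out line.

    Surrounding whitespace is ignored (an assumption of this sketch).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def no_regression(candidate_accuracy: float, baseline_accuracy: float) -> bool:
    """Hypothetical release gate: the candidate must not score below the baseline."""
    return candidate_accuracy >= baseline_accuracy


# Invented held-out lines and model outputs.
references = ["return a + b", "x = 1", "print(x)"]
baseline_predictions = ["return a - b", "x = 1", "print(y)"]
candidate_predictions = ["return a + b", "x = 1 ", "print(y)"]

baseline_accuracy = exact_match_accuracy(baseline_predictions, references)
candidate_accuracy = exact_match_accuracy(candidate_predictions, references)
```

Exact match is a strict metric: a prediction that is functionally equivalent but textually different counts as a miss, which makes the score easy to automate but conservative.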

Once a model is released to internal developers, a variety of usage metrics are tracked to gauge productivity. This includes measuring the acceptance rate of completions that developers viewed for at least 750 milliseconds, as studied by Tabachnyk et al. 2022 and Murali et al. 2024, and the percentage of a completion that remains unchanged after set time periods up to 30 minutes, as noted by Ziegler et al. 2022.
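The two usage metrics above could be computed along these lines. The 750-millisecond threshold comes from the text; the event schema, the function names, and the use of a character-level diff to measure retention are assumptions for illustration, not a description of CodeGenie's telemetry.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

MIN_VIEW_MS = 750  # completions shown for less time are excluded


@dataclass
class CompletionEvent:
    shown_ms: int   # how long the suggestion was visible
    accepted: bool  # whether the developer accepted it


def acceptance_rate(events: list[CompletionEvent]) -> float:
    """Accepted share of the completions that were viewed long enough to count."""
    viewed = [e for e in events if e.shown_ms >= MIN_VIEW_MS]
    if not viewed:
        return 0.0
    return sum(e.accepted for e in viewed) / len(viewed)


def retained_fraction(completion: str, later_text: str) -> float:
    """Share of the completion's characters still present, in order, in the later text."""
    if not completion:
        return 0.0
    matcher = SequenceMatcher(None, completion, later_text, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(completion)


# Invented telemetry events.
events = [
    CompletionEvent(shown_ms=1000, accepted=True),
    CompletionEvent(shown_ms=800, accepted=False),
    CompletionEvent(shown_ms=200, accepted=True),  # too brief to count
    CompletionEvent(shown_ms=750, accepted=True),
]
```

In practice `retained_fraction` would be evaluated against snapshots of the edited region taken at several intervals after acceptance, up to the 30-minute mark mentioned above.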

Since the launch of CodeGenie, there has been significant engagement, with developers accepting over 2 million lines of code and posing more than 500,000 chat questions. This level of interaction underscores the utility and effectiveness of the AI tools being developed, demonstrating their impact on improving developer productivity and software development processes.

Enhancing CodeGenie: Towards a 24/7 Developer Assistant

CodeGenie has demonstrated its effectiveness in various applications using generative AI, and there is now a chance to further enhance its capabilities. Internal developers are actively participating in this effort. A recent hackathon encouraged participants to use AI to boost developer productivity, resulting in 200 developers creating 74 projects. These ranged from IDE agents designed for specific debugging tasks to integrations with other SFDC touchpoints. Inspired by the strong response to developer-led initiatives, an innersourcing model was implemented to officially incorporate these projects into CodeGenie.

The ultimate goal is to enable CodeGenie to operate around the clock, analyzing multiple data sources such as planning data, code, quality metrics, security data, and operational logs. By doing so, it can proactively assist developers by generating comprehensive plans for them to execute.

For instance, CodeGenie could have the capability to analyze a work item and generate a detailed plan that includes creating new code files or editing existing ones, along with generating the necessary test cases. Developers would then simply need to review and possibly tweak these plans before executing them, allowing CodeGenie to implement the necessary changes.

Additionally, CodeGenie could monitor software dependencies and recommend updates to maintain security standards. It could also measure quality metrics like code coverage and code smells to suggest enhancements to existing test cases or the creation of new ones.

Ultimately, the vision for CodeGenie is to transform it into a true 24/7 pair programmer for every developer. This enhancement aims to significantly boost developer experience, productivity, and the overall quality of the output, making CodeGenie an indispensable tool in the software development process.
