Announcing Maxim AI’s general availability and seed round

Today, we are excited to announce the general availability of Maxim AI’s evaluation platform. We are also thrilled to partner with the incredible team at Elevation Capital and the fantastic set of founders and operators who share our vision to accelerate the future of AI development!

How we started

During my time building products in the Google Assistant NLP and developer platform teams, both the need for and the challenges of evaluating LLM-based applications became starkly evident as we employed LLMs in our stack.

Over the past 1.5 years, as these powerful large language models became accessible via APIs to ~30M developers, organizations across the board began to re-anchor their software engineering teams to build AI applications. Today, these developers drive the bulk of development in the AI application layer. They traditionally built software in a deterministic paradigm, with standardised best practices for predictably testing and systematically improving products. This test-driven development was tightly integrated into the software development lifecycle.

Generative AI products, however, are built in a non-deterministic paradigm, with unpredictable variability in quality and performance depending on factors such as models, parameters, data, context, or simply the framing of the question. Moreover, challenges around hallucinations, inaccuracies, safety, and output structure pose significant risks to user trust and organizational brand reputation; recent examples include Air Canada’s ‘Remarkable’ Lying AI Chatbot, the Chevrolet dealership chatbot selling a Chevy Tahoe for $1, and Google Gemini’s controversial image bot.

Today, organizations are resorting to non-scalable techniques and expensive manual effort, resulting in tediously slow iteration cycles as they test and ship their AI to production. Many organizations only observe AI performance post-deployment and make reactive improvements. They lack the foundation to systematically evaluate whether they are improving or regressing as they adapt to a newly released SOTA model or make a simple change in their existing pipeline.

Maxim AI’s infrastructure, which sits between the model and application layers of the GenAI stack, aims to bring the best practices of test-driven development to streamline AI development workflows. Along with my co-founder Akshay, who has been building products for developers throughout his career, and our fantastic team, I am excited to be on this ambitious mission to accelerate the future of how developers build the next generation of software.

Our journey so far

Since starting Maxim last year, we have been moving at an incredible pace to empower AI developers in leading companies across Enterprise software, GenAI services, BFSI, and EdTech sectors to ship their products with speed, reliability, and confidence.

Over the last few months, we have developed the end-to-end evaluation stack for AI development, comprising: 

  1. Experimentation suite, which enables teams to rapidly, systematically, and collaboratively iterate on prompts, models, parameters, and other components of their compound AI systems during the prototype stage and identify the optimal combinations for their use case. This includes prompt CMS, prompt IDE, visual workflow builder, and connectors to external data sources/functions.
  2. Pre-release evaluation toolkit, which offers a unified framework for machine and human evaluation, enabling teams to quantitatively determine improvements or regressions for their application on large test suites and deploy confidently. It also comprises Maxim’s evaluator store, offering our proprietary pre-built and custom evaluation models. The framework integrates seamlessly with the development team’s CI/CD workflows.
  3. Observability suite, which empowers developers to monitor real-time production logs and run them through automated evaluations to ensure in-production quality and safety, thus helping them optimize their AI systems systematically.
  4. Data engine, which enables teams to seamlessly tailor multimodal datasets for their RAG, fine-tuning, and evaluation needs.
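
To make the idea of quantitatively determining improvements or regressions on a test suite more concrete, here is a minimal sketch in Python. It does not use Maxim's SDK or reflect its API; every name in it (`exact_match`, `evaluate`, `regression_gate`, the toy apps, and the test suite) is a hypothetical stand-in, and a real setup would likely use richer evaluators and much larger suites.

```python
from statistics import mean

def exact_match(output: str, expected: str) -> float:
    """Toy evaluator (assumption): 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def evaluate(app, test_suite, evaluator=exact_match) -> float:
    """Mean evaluator score of `app` over (question, expected) pairs."""
    return mean(evaluator(app(question), expected) for question, expected in test_suite)

def regression_gate(baseline: float, candidate: float, tolerance: float = 0.02) -> bool:
    """True if the candidate score has not dropped more than `tolerance` below baseline."""
    return candidate >= baseline - tolerance

# Hypothetical test suite and apps; in practice `app` would call a model or pipeline.
TEST_SUITE = [
    ("capital of France?", "Paris"),
    ("2 + 2?", "4"),
    ("largest planet?", "Jupiter"),
    ("H2O is?", "water"),
]

def baseline_app(question: str) -> str:
    answers = {"capital of France?": "Paris", "2 + 2?": "4",
               "largest planet?": "Jupiter", "H2O is?": "ice"}
    return answers[question]

def candidate_app(question: str) -> str:
    answers = {"capital of France?": "paris", "2 + 2?": "4",
               "largest planet?": "Jupiter", "H2O is?": "water"}
    return answers[question]

if __name__ == "__main__":
    baseline_score = evaluate(baseline_app, TEST_SUITE)
    candidate_score = evaluate(candidate_app, TEST_SUITE)
    print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f}")
    # A CI job could fail the build when the gate returns False.
    if not regression_gate(baseline_score, candidate_score):
        raise SystemExit("candidate regressed against baseline")
```

A check like this could run on every change to a prompt, model, or parameter, so a regression is caught before release rather than after deployment.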

We've built our platform from the ground up, keeping in mind the needs of large AI organizations. This includes seamless in-VPC deployments, robust role-based access control, compliance, and enhanced collaboration features. Additionally, we provide custom dataset support and Maxim-managed human evaluation for the last mile of AI deployment.

It's early days, but the teams we have been working closely with have been able to test, iterate, and ship ~5x faster, leveraging the Maxim platform. The product is anchored to these fundamental principles:

  1. Developer experience. We are singularly focused on AI application developers – we are from the devtools industry, and we are creating for the devtools industry. Our developer experience is deeply anchored in the best practices of traditional software deployment, be it evaluations without any SDK integrations or seamless CI/CD integrations.
  2. End-to-end platform. We strongly believe that a single, integrated platform to help businesses manage all testing-related needs across the AI development lifecycle will drive real productivity and quality gains for building enduring applications. We at Maxim are building just that: the end-to-end evaluation stack for AI development.
  3. Solving for the last mile. We are all bullish about automated evaluations; however, today, the last mile of AI deployment has to involve humans. We aim to accelerate this collaboration between humans and AI to supercharge AI development to the last mile.

Our excitement about the future 

While this is an exciting milestone, it’s just the start! Maxim is committed to empowering developers to build world-class AI solutions for the world’s most pressing problems. All of us here continue to be inspired by this tremendous opportunity. 

We are aggressively expanding our platform capabilities across the evaluation stack and multi-modal data engine to accelerate the reliable and scalable deployment of AI products. We are also working on ways to extend these capabilities to the multi-modal and agentic AI applications of the future.

Big things lie ahead ⚡, and we are excited to partner with more developers who are building the future of AI. If the challenges we are solving resonate with you, we’d love to chat!

We are hiring across functions: AI research, Applied AI, Full-stack software engineering, and GTM. Join our team if our mission inspires you.