December 28, 2024

🔗 Practical Text to SQL from LinkedIn

A great article on how to use LLMs to generate SQL queries from natural language. Albert Chen’s deep dive into LinkedIn’s SQL Bot offers a fascinating glimpse into the marriage of generative AI with enterprise-scale data analytics. This multi-agent system, integrated into LinkedIn’s DARWIN platform, exemplifies how cutting-edge AI can democratize access to data insights while enhancing efficiency across teams.

Key Takeaways:

Empowering Data Democratization: SQL Bot addresses a classic bottleneck: dependency on data teams for insights. By enabling non-technical users to autonomously query databases using natural language, LinkedIn has transformed a time-intensive process into a streamlined, scalable solution.

Data Cleaning and Annotation:

we initiated a dataset certification effort to collect comprehensive descriptions for hundreds of important tables. Domain experts identified key tables within their areas and provided mandatory table descriptions and optional field descriptions. These descriptions were augmented with AI-generated annotations based on existing documentation and Slack discussions, further enhancing our ability to retrieve the right tables and use them properly in queries.

  • Metadata and Knowledge Graphs as Pillars: They infer the tables a user cares about based on their org chart. Comprehensive dataset certification, enriched with AI-generated annotations, ensures accurate table retrieval despite LinkedIn’s vast table inventory (in the millions!). By combining domain knowledge, query logs, and example queries into a knowledge graph, SQL Bot builds a robust contextual foundation for query generation.
  • LLM-Driven Iterative Query Refinement: Leveraging LangChain and LangGraph, the system iteratively plans and constructs SQL queries. Validators and self-correction agents ensure outputs are precise, efficient, and error-free, highlighting the sophistication of LinkedIn’s text-to-SQL pipeline (a rough sketch of this loop follows the list).
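The iterative refinement they describe maps to a simple loop: retrieve candidate tables, draft a query, validate it, and feed errors back to the model. Here’s a minimal sketch of that idea in plain Python; `llm`, `retrieve_tables`, and `run_explain` are hypothetical stand-ins, not LinkedIn’s actual LangChain/LangGraph components.

```python
# Minimal sketch of a plan -> write -> validate -> fix loop for text-to-SQL.
# `llm`, `retrieve_tables`, and `run_explain` are hypothetical stand-ins.

def generate_sql(question: str, llm, retrieve_tables, run_explain, max_fixes: int = 3) -> str:
    # Retrieval step: find candidate tables from metadata / a knowledge graph.
    tables = retrieve_tables(question)

    # Planning step, then a first draft of the query.
    plan = llm(f"Plan the query steps for: {question}\nTables: {tables}")
    sql = llm(f"Write SQL for this plan:\n{plan}\nTables: {tables}")

    # Validator / self-correction loop: check the query and feed errors back.
    for _ in range(max_fixes):
        ok, error = run_explain(sql)  # e.g. EXPLAIN against the warehouse
        if ok:
            return sql
        sql = llm(f"This SQL failed validation with:\n{error}\nFix it:\n{sql}")

    raise RuntimeError("Could not produce a valid query")
```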

Personalized and Guided User Experiences: With features like quick replies, rich display elements, and a guided query-writing process, SQL Bot prioritizes user understanding and engagement. Its integration with DARWIN, complete with saved chat history and custom instructions, amplifies its accessibility and adoption.

Benchmarking and Continuous Improvement: They place a strong emphasis on benchmarking, pairing human evaluation with LLM-as-a-judge methods to build a scalable approach to query assessment and model enhancement.
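For a flavor of what LLM-as-a-judge can look like in this setting, here’s a minimal sketch. The rubric, scoring scale, and `llm` helper are my assumptions, not details from the article.

```python
# Minimal LLM-as-a-judge sketch for grading generated SQL against a reference.
# The rubric and 1-5 scale are illustrative assumptions; `llm` is hypothetical.

JUDGE_PROMPT = """You are grading a generated SQL query against a reference.
Question: {question}
Reference SQL: {reference}
Candidate SQL: {candidate}
Score 1-5 for correctness (same result set) and respond with just the number."""

def judge(question: str, reference: str, candidate: str, llm) -> int:
    reply = llm(JUDGE_PROMPT.format(
        question=question, reference=reference, candidate=candidate))
    return int(reply.strip().split()[0])  # be forgiving about extra text
```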

Reflections: Where many text-to-SQL solutions stumble when handling many tables, LinkedIn’s SQL Bot thrives by leveraging metadata, personalized retrieval, and user-friendly design. It’s also impressive how the system respects permissions, ensuring data security without sacrificing convenience.

Moreover, the survey results—95% user satisfaction with query accuracy—highlight the system’s impact. This balance of technical innovation and user-centric design offers a blueprint for organizations looking to replicate LinkedIn’s success.

Why It Matters: I think there are enough details in this article to either add new features to your existing text to sql bot, or to build your own. LinkedIn’s work on SQL Bot is a testament to the power of AI in reshaping how we interact with complex data systems. It’s an inspiring read for engineers, data scientists, and AI enthusiasts aiming to make SQL data more accessible.

🔗 •

December 27, 2024

🔗 Cognitive load importance

A live article on measuring cognitive load by Artem Zakirullin. It’s being continuously updated, but one clip that resonates with me is:

Once you onboard new people on your project, try to measure the amount of confusion they have (pair programming may help). If they’re confused for more than ~40 minutes in a row - you’ve got things to improve in your code.

With AI tooling, the amount of code is bound to grow. We should make sure to be cognizant of the cognitive load required to maintain it all.

🔗 •
🔗 Building effective agents

Agent is an overloaded term in AI. I found this article particularly helpful in understanding the current definitions of agentic systems, agents, and workflows. Plus, it included a survey of some popular workflow architectures. To me, it all starts looking a lot like the microservices architectures that Amazon helped popularize over the past decade.

One principle to follow when building agentic systems (and software in general)…

When building applications with LLMs, we recommend finding the simplest solution possible, and only increasing complexity when needed. This might mean not building agentic systems at all. Agentic systems often trade latency and cost for better task performance, and you should consider when this tradeoff makes sense.

Their definitions of agents…

  • Agentic systems - parent term for an entire app or system
  • Workflows - systems where multiple LLMs are orchestrated together with code
  • Agents - LLMs dynamically direct their own processes and tool usage
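To make the workflow vs. agent distinction concrete, here’s a toy sketch. The `llm` completion function and the tiny text-based tool protocol are my own inventions for illustration, not anything from the article.

```python
# Toy contrast between the two terms; `llm` is a hypothetical completion function.

# Workflow: the orchestration is fixed in code; LLM calls fill in the steps.
def summarize_then_translate(text: str, llm) -> str:
    summary = llm(f"Summarize: {text}")
    return llm(f"Translate to French: {summary}")

# Agent: the LLM decides which tool to call next until it declares it is done.
def agent(task: str, llm, tools: dict) -> str:
    context = task
    while True:
        action = llm(f"Task: {context}\nTools: {list(tools)}\n"
                     "Reply 'TOOL <name> <input>' or 'DONE <answer>'.")
        if action.startswith("DONE"):
            return action[len("DONE "):]
        _, name, arg = action.split(" ", 2)
        context += f"\n{name} returned: {tools[name](arg)}"
```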

Their walkthrough of common workflow patterns was interesting as well. They walk through prompt chaining, routing, parallelization, orchestrator-workers, and evaluator-optimizer. I think as reasoning models like o1 become more powerful, we’ll start to see these workflows become less important. Let’s look at Evaluator-optimizer:

Evaluator Optimizer

This is just one LLM call producing content or a response while another LLM call provides feedback or an evaluation. It’s useful when we have clear evaluation criteria.
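Here’s a minimal sketch of that loop, again with a hypothetical `llm` function; the PASS/fail protocol is my own simplification, not Anthropic’s implementation.

```python
# Minimal evaluator-optimizer sketch: one LLM call generates, another
# critiques, and we loop until the critique passes. `llm` is hypothetical.

def evaluator_optimizer(task: str, criteria: str, llm, max_rounds: int = 3) -> str:
    draft = llm(f"Complete this task:\n{task}")
    for _ in range(max_rounds):
        verdict = llm(f"Evaluate against these criteria:\n{criteria}\n"
                      f"Draft:\n{draft}\nReply 'PASS' or list the problems.")
        if verdict.strip().startswith("PASS"):
            break
        draft = llm(f"Task: {task}\nFix these problems:\n{verdict}\n"
                    f"Previous draft:\n{draft}")
    return draft
```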

As reasoning models improve, more applications will move from workflows to agents. For models like o1 that do multi-step reasoning, one of the first paths I expect them to internalize is the iterative loop above. Let’s try it with Google Gemini’s new 2.0 Flash Thinking Experimental:

Prompt 💭

Generate a day-by-day training plan for a reactive dog exhibiting specific behaviors: lunging, barking at strangers. The plan should incorporate positive reinforcement techniques and gradually desensitize the dog to triggers over 4 weeks. Each day’s plan must include:
•	A clearly defined goal for that day.
•	Detailed instructions for the owner, including step-by-step exercises.
•	A checklist to measure the dog’s progress.

Evaluation Criteria:

1. Clarity: Are the daily goals and instructions written in a way that is easy to understand for a non-expert dog owner?
2. Feasibility: Are the exercises practical for an average dog owner to implement with minimal equipment or prior training experience?
3. Progressiveness: Does the plan show a clear progression from simple to complex tasks aligned with the dog’s desensitization process?
4. Effectiveness: Based on canine training principles, is the plan likely to improve the dog’s behavior when executed consistently?
5. Empathy: Does the tone and approach demonstrate understanding of the challenges faced by both the dog and the owner?
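As an aside, here’s roughly how you could send this prompt from Python using Google’s google-generativeai package. The model ID below is the experimental one available around the time of writing and is an assumption on my part; it may have changed since.

```python
# Sketch of calling Gemini 2.0 Flash Thinking with the prompt above.
import google.generativeai as genai  # pip install google-generativeai

genai.configure(api_key="YOUR_API_KEY")  # assumes you have a Gemini API key

# Experimental model ID from around the time of writing; may have changed.
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")

prompt = "Generate a day-by-day training plan for a reactive dog..."  # the full prompt above
response = model.generate_content(prompt)
print(response.text)
```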

Thinking 🧠

This is a stripped down version of Gemini’s thought process.

1. Deconstruct the Request
2. Establish Foundational Principles: Positive Reinforcement, Classical Conditioning, Operant Conditioning, Gradual Exposure, Consistency is Key, Manage the Environment, Listen to Your Dog, Patience and Understanding
3. Structure the 4 Weeks: Week 1: Foundation Building, Week 2: Controlled Exposure (Distance), Week 3: Controlled Exposure (Closer), Week 4: Generalization and Maintenance
4. Design Daily Goals
5. Develop Detailed Instructions
6. Create Progress Checklists
7. Incorporate Evaluation Criteria Throughout
8. Refine and Review
9. Add Important Considerations
10. Self-Correction Example during the Process
    - Initial Thought, Correction, Revised Approach

Check it out! This thinking process is actually a perfect example of how our brains bounce between coming up with ideas and checking if they’re any good. Like, there’s this awesome moment where they think “hey, maybe let’s throw the dog right into meeting strangers” but then the evaluation part of their brain kicks in with “whoa, hold up, that’s way too intense for a scared pup.” The whole thing is basically a dance between brainstorming cool training ideas and then reality-checking them against what actually works for reactive dogs and their humans - it’s like having an excited idea machine and a practical bouncer working together in your head to build something that actually makes sense.

The parallel between AI workflow architectures and the evolution of microservices hints at a deeper pattern in software engineering - we repeatedly solve the problem of complexity by breaking it down into smaller, specialized components that communicate in structured ways. Additionally, as models like OpenAI’s o1 (or even o3) become more sophisticated at multi-step reasoning, we may see a shift back toward unified systems. This raises an interesting question: Is the current trend toward complex workflow architectures a permanent evolution in AI system design, or just a temporary solution until single models become capable enough to handle these tasks autonomously? The answer could fundamentally reshape how we build AI systems in the coming years.

🔗 •

🔗 AWS Amplify and WAF

TIL that AWS Amplify added integration with AWS WAF.

Amplify Console
As I was setting up this blog, I stumbled across a new feature that AWS Amplify added. It looks like this came out just a couple of days ago and is in public preview. I’m excited about this because it can do IP blocking, which you previously had to implement manually, or else you had to settle for username/password protection while leaving the app exposed to the public internet.
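For a rough idea of the building block involved, here’s a hedged boto3 sketch that creates the kind of IP set a web ACL rule could reference. The names and CIDR are placeholders, and attaching the resulting web ACL to an Amplify app happens through the new Amplify/WAF integration itself.

```python
# Sketch: create a WAF IP set that a web ACL rule could use for IP blocking.
# Name, description, and CIDR are placeholders, not values from the post.
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")  # CLOUDFRONT scope requires us-east-1

ip_set = wafv2.create_ip_set(
    Name="blog-allowed-ips",       # hypothetical name
    Scope="CLOUDFRONT",            # Amplify apps are fronted by CloudFront
    IPAddressVersion="IPV4",
    Addresses=["203.0.113.0/24"],  # example CIDR (RFC 5737 documentation range)
    Description="IPs allowed to reach the Amplify app",
)
print(ip_set["Summary"]["ARN"])
```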

🔗 •