From Images to Insight in Snowflake: A Hands-On Snowflake Cortex COMPLETE Multimodal POC

By Aspanna Yaswanth and Vijayaraju Yarra

Executive Summary

This proof-of-concept (POC) validates Snowflake Cortex COMPLETE Multimodal for common enterprise tasks (reading code from images, captioning and classifying images, and extracting insights from charts) directly in Snowflake. Results show it is fast to prototype with and good enough for many internal analytics and document workflows, with important caveats around numeric precision, image-count limits, and the lack of interactive, multi-turn prompting.

Introduction

The ability to derive insights from data has long been central to modern business. But what happens when that data isn’t just structured rows and columns, but also images, charts, and even complex documents? This is where Snowflake Cortex COMPLETE Multimodal comes into play, a versatile function designed to bring the power of robust text and vision models directly into your Snowflake environment, all through user-friendly, instruction-based SQL.

We recently conducted a Proof of Concept (POC) to test the core capabilities of Cortex COMPLETE Multimodal, and the results are truly impressive, demonstrating how Snowflake is unifying the processing of diverse data types within its secure platform.


Cortex COMPLETE: Your Gateway to Multimodal AI

At its heart, Cortex COMPLETE (specifically SNOWFLAKE.CORTEX.COMPLETE) allows you to interact with powerful AI models using simple SQL syntax. It supports a range of tasks, including:

  • Comparing images: Analyze visual content for differences or similarities.
  • Captioning images: Generate descriptive text for images.
  • Classifying images: Categorize images based on their content.
  • Extracting entities from images: Identify and pull specific information from visual data.
  • Answering questions: Get insights from data presented in graphs and charts.

The beauty lies in its simplicity: you point to a model, provide a prompt (which can include references to images stored in Snowflake stages), and let Cortex do the heavy lifting.
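As a minimal sketch of that pattern, a single-image call might look like the following. The stage name `@poc_images` and the file name `example.jpg` are hypothetical placeholders, not names from the POC.

```sql
-- Sketch only: @poc_images and example.jpg are hypothetical names.
-- Arguments: model name, text prompt, and a reference to a staged image.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    'Describe this image in one sentence.',
    TO_FILE('@poc_images', 'example.jpg')
) AS response;
```

The response comes back as text in a single column, so it can be stored or joined like any other query result.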

Images from Kaggle and other sources were imported into a Snowflake stage. We then ran queries to test each core capability, primarily using the claude-3-5-sonnet model.

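A stage for this kind of experiment could be set up roughly as follows. The stage name `poc_images` is hypothetical. The sketch assumes an internal stage with a directory table and server-side encryption, which is the setup Snowflake's documentation describes for image inputs; check the current requirements for your account.

```sql
-- Sketch only: poc_images is a hypothetical stage name.
CREATE STAGE IF NOT EXISTS poc_images
    DIRECTORY = (ENABLE = TRUE)
    ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');

-- Upload the image files (for example through Snowsight or a client-side PUT),
-- then refresh the directory table so the new files are listed.
ALTER STAGE poc_images REFRESH;

-- Confirm which files are available to reference in prompts.
SELECT relative_path, size, last_modified
FROM DIRECTORY(@poc_images);
```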
1. Reading and Analyzing Graphs & Charts

Extracting insights from visual data like graphs is a powerful capability.

The Task: Analyze an “Electronic Store Sales Order Analysis” chart.

The Output: The model provided a good summary of trends for Mobile, Sound System, and TV sales, identifying peaks, declines, and overall patterns.

Our Observation: While the analysis was generally well-done, the model cited specific, precise order numbers that weren’t explicitly available in the graph, which only presented approximate figures. This suggests a need for careful interpretation of quantitative results from visual analysis.

We also tested with more complex charts, such as “TSA Passenger Volume” and “World Wine Consumption and Vine Acres,” where the model provided accurate and neat explanations.

However, on “Complex Chart – 3,” which depicts adult population growth rates, the model misidentified the colors assigned to some countries and made some inaccurate general observations.

This highlights the ongoing development of these models and the importance of validation, especially with nuanced or less common chart types.
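One way to reduce the risk of invented precision is to say so in the prompt. The sketch below is an illustration rather than the query used in the POC. The stage and file names are hypothetical, and the instruction wording is only one possible phrasing.

```sql
-- Sketch only: @poc_images and electronic_store_sales.png are hypothetical names.
-- The prompt asks the model to separate printed values from visual estimates.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    PROMPT(
        'Summarize the trends in this chart {0}. Quote a number only if it is printed on the chart; otherwise give an approximate range and label it as an estimate.',
        TO_FILE('@poc_images', 'electronic_store_sales.png')
    )
) AS chart_summary;
```

Even with such an instruction, quantitative claims should still be checked against the underlying data where it is available.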

2. Comparing Images: Optimizing SQL with AI Vision

Imagine needing to compare two versions of a SQL query presented as images and identify the more optimized one. Cortex COMPLETE can do just that.

Our SQL Query:

The Output: The model successfully identified the optimized query and provided a detailed explanation, highlighting how the second query avoided unnecessary subqueries, selected only needed columns, and eliminated redundant checks. This showcases its ability to extract text accurately from images and apply logical reasoning.
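A comparison of this kind might be expressed as in the sketch below, which is not necessarily the query used in the POC. It assumes the `PROMPT` helper with numbered placeholders for each image; the stage and file names are hypothetical.

```sql
-- Sketch only: @poc_images, query_v1.png and query_v2.png are hypothetical names.
-- {0} and {1} are replaced by the two staged images, in order.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    PROMPT(
        'Image {0} and image {1} each show a SQL query. Which one is more optimized, and why?',
        TO_FILE('@poc_images', 'query_v1.png'),
        TO_FILE('@poc_images', 'query_v2.png')
    )
) AS comparison;
```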

3. Extracting Text from Zoomed-Out Images

We tested the model’s robustness with a zoomed-out image containing text.

Our SQL Query:

The Observation: The model read the text with minor errors. This underscores a basic principle: output quality depends on input quality. Clear, high-resolution images yield better results.
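A text-extraction call could look like the following sketch. The stage and file names are hypothetical, and asking the model to mark unreadable words is an added suggestion, not something tested in the POC.

```sql
-- Sketch only: @poc_images and zoomed_out_text.png are hypothetical names.
-- Asking for a marker on illegible words makes low-confidence spots easier to review.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    PROMPT(
        'Transcribe all text visible in this image {0} exactly as written. Replace any word you cannot read with [unclear].',
        TO_FILE('@poc_images', 'zoomed_out_text.png')
    )
) AS extracted_text;
```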

4. Image Captioning

Generating descriptive captions for images is another strong feature.

Our SQL Query:

The Output: The model generated accurate captions for images of a musician, a construction worker, and children playing with LEGOs. This demonstrates its ability to understand context and identify key elements within images.
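Captioning several images can be sketched as one call per row of the stage's directory table. This is an illustration, not necessarily how the POC was run; the stage name and the file-extension filter are assumptions.

```sql
-- Sketch only: @poc_images is a hypothetical stage name.
-- Each row of the directory table yields one call with exactly one image.
SELECT
    relative_path,
    SNOWFLAKE.CORTEX.COMPLETE(
        'claude-3-5-sonnet',
        PROMPT(
            'Write a one-sentence caption for this image {0}.',
            TO_FILE('@poc_images', relative_path)
        )
    ) AS caption
FROM DIRECTORY(@poc_images)
WHERE relative_path ILIKE '%.jpg'
   OR relative_path ILIKE '%.png';
```

Because every call carries a single image, this pattern also stays clear of any per-request image count limit.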

Key Observations & Potential Challenges

Our POC provided valuable insights beyond just successful executions:

  • Input Quality Matters: The accuracy of Cortex COMPLETE’s output is highly dependent on the clarity and quality of the input images.
  • Image Count Limit Discrepancy: Despite official documentation stating support for up to 100 images, we encountered an error when submitting more than 20 images (“invalid request parameters: max image count exceeded”). This is a significant finding for batch processing workflows.
  • Model Transparency: As with many LLMs, the “black box” nature of the model’s decision-making process can make troubleshooting challenging.
  • Token Limits: The claude-3-5-sonnet model has a 200,000-token context window that input and output share, which users must manage to avoid errors.
  • Regional Availability: The model’s availability is tied to specific Snowflake regions, though cross-region inference can be enabled.
  • Stateless Interactions: COMPLETE is designed for one-time prompt completions; it does not support interactive chat contexts or follow-up questions within the same session.
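Two of these points can be sketched in SQL. The first statement enables cross-region inference through the `CORTEX_ENABLED_CROSS_REGION` account parameter; `'ANY_REGION'` is one permitted value, and more restrictive settings exist. The second query illustrates one way to work around statelessness: resend the image together with the earlier answer in a new prompt. The stage and file names are hypothetical, and passing a text column to `PROMPT` alongside a file is an assumption about how the placeholders accept arguments.

```sql
-- Sketch only. Changing account parameters requires the ACCOUNTADMIN role.
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION';

-- A follow-up question must carry its own context, because no session state is kept.
-- @poc_images and sales_chart.png are hypothetical names.
WITH first_pass AS (
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
        'claude-3-5-sonnet',
        PROMPT('Summarize the trends in this chart {0}.',
               TO_FILE('@poc_images', 'sales_chart.png'))
    ) AS summary
)
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    PROMPT('An earlier summary of this chart {0} said: {1} Which product line does that summary describe as declining?',
           TO_FILE('@poc_images', 'sales_chart.png'),
           summary)
) AS follow_up
FROM first_pass;
```

Resending context this way increases token usage on every call, which matters given the shared context window noted above.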

Conclusion

Our deep dive into Snowflake Cortex COMPLETE Multimodal unequivocally demonstrates its potential to revolutionize how enterprises interact with and extract value from their diverse data assets. While continuous improvements are expected (especially regarding image batching and nuanced chart interpretation), the current capabilities are robust. Snowflake Cortex is not just a feature; it’s an intelligent platform that empowers users to build sophisticated AI applications with the simplicity and scalability of SQL, truly bringing AI closer to your data.

About kipi.ai

Kipi.ai, a WNS Company, is a global leader in data modernization and democratization focused on the Snowflake platform. Headquartered in Houston, Texas, Kipi.ai enables enterprises to unlock the full value of their data through strategy, implementation and managed services across data engineering, AI-powered analytics and data science.

As a Snowflake Elite Partner, Kipi.ai has one of the world’s largest pools of Snowflake-certified talent—over 600 SnowPro certifications—and a portfolio of 250+ proprietary accelerators, applications and AI-driven solutions. These tools enable secure, scalable and actionable data insights across every level of the enterprise. Serving clients across banking and financial services, insurance, healthcare and life sciences, manufacturing, retail and CPG, and hi-tech and professional services, Kipi.ai combines deep domain excellence with AI innovation and human ingenuity to co-create smarter businesses. As a part of WNS, Kipi.ai brings global scale and execution strength to accelerate Snowflake-powered transformation world-wide.

For more information, visit www.kipi.ai.

August 26, 2025