Query CSV

AWS S3 Only: This flow uses S3 Select and only works with AWS S3 (or S3-compatible storage that supports S3 Select). It will NOT work with DigitalOcean Spaces or other providers that do not support S3 Select.

Query CSV allows you to run SQL-like queries directly on CSV files stored in S3. Instead of downloading the entire file, S3 filters the data and returns only matching rows.

Parameters

Parameter | Required | Description
key       | Yes      | The path/filename of the CSV file to query (e.g., "reports/sales.csv")
query     | Yes      | SQL query to execute (e.g., "SELECT * FROM s3object WHERE status='active'")

 

Example Payload

{
    "key": "reports/users.csv",
    "query": "SELECT * FROM s3object WHERE status='active'"
}
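
As a minimal sketch, the payload above could be assembled in Python as follows. The helper name build_query_csv_payload is hypothetical and not part of the flow; it only checks that both required parameters are present before serializing them to JSON.

```python
import json


def build_query_csv_payload(key: str, query: str) -> str:
    """Serialize the two required Query CSV parameters to JSON.

    Hypothetical helper for illustration; not part of the flow itself.
    """
    if not key:
        raise ValueError("key is required")
    if not query:
        raise ValueError("query is required")
    return json.dumps({"key": key, "query": query}, indent=4)


payload = build_query_csv_payload(
    "reports/users.csv",
    "SELECT * FROM s3object WHERE status='active'",
)
print(payload)
```

How the payload is delivered to the flow depends on your setup and is not covered by this sketch.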

SQL Query Examples

Query                                        | Description
SELECT * FROM s3object                       | Returns all rows
SELECT * FROM s3object WHERE status='active' | Returns only rows where status is "active"
SELECT name, email FROM s3object             | Returns only the name and email columns
SELECT * FROM s3object WHERE id='2'          | Returns the row with id 2
SELECT * FROM s3object LIMIT 10              | Returns only the first 10 rows
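
To get a feel for what these queries return before running them against S3, the sketch below loads a small CSV into an in-memory SQLite table named s3object and runs the same statements locally. This is only a local approximation: SQLite is not S3 Select, the two SQL dialects differ in places, and the sample data and the run_locally helper are made up for illustration.

```python
import csv
import io
import sqlite3

# Made-up sample data; the first row holds the column headers.
SAMPLE_CSV = """id,name,email,status
1,Ada,ada@example.com,active
2,Grace,grace@example.com,inactive
3,Linus,linus@example.com,active
"""


def run_locally(csv_text: str, query: str) -> list:
    """Run a query against an in-memory SQLite copy of the CSV.

    Local stand-in for illustration only; this does not call S3.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    # Every column is stored as text, so values compare as strings (id='2').
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f"CREATE TABLE s3object ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO s3object VALUES ({placeholders})", records)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


active = run_locally(SAMPLE_CSV, "SELECT * FROM s3object WHERE status='active'")
contacts = run_locally(SAMPLE_CSV, "SELECT name, email FROM s3object")
by_id = run_locally(SAMPLE_CSV, "SELECT * FROM s3object WHERE id='2'")
print(active, contacts, by_id, sep="\n")
```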

 

Response

On success, returns:

  • data - The filtered CSV content (only matching rows)
  • data.responseCode - HTTP status code (200 for success)

On error, returns:

  • error - Error details from S3
  • error.responseCode - HTTP error code
  • error.Message - Error description
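
Once the filtered CSV text is in hand, it can be turned into records with Python's csv module. The sketch below assumes the CSV content has already been extracted from data as a plain string, and that the returned rows do not include a header line, so the column names are passed in explicitly. Both are assumptions; adjust them to match the response you actually receive. The parse_rows helper is hypothetical.

```python
import csv
import io


def parse_rows(csv_text: str, fieldnames: list) -> list:
    """Parse header-less CSV text into a list of dicts keyed by column name.

    Hypothetical helper; assumes the caller knows the column order.
    """
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=fieldnames)
    return [dict(row) for row in reader]


# Made-up example of filtered content, standing in for a real response.
filtered = "1,Ada,active\n3,Linus,active\n"
records = parse_rows(filtered, ["id", "name", "status"])
for record in records:
    print(record["name"], record["status"])
```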

 

How It Works

Query CSV uses S3 Select, a feature that allows you to retrieve a subset of data from an object using SQL expressions. This is more efficient than downloading entire files when you only need specific data.

  • The flow sends a POST request to S3 with your SQL query in an XML payload
  • S3 parses the CSV file and executes the query server-side
  • Only matching rows are returned, reducing data transfer and processing time
  • The first row of your CSV must contain column headers (these become field names in queries)
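
The XML payload mentioned above is not shown in this article. As a rough sketch, a request body in the shape of the S3 SelectObjectContent API could be built as follows. The FileHeaderInfo value of USE is an assumption based on the requirement that the first row contains column headers, and the build_select_request helper is hypothetical; the flow's actual payload may differ.

```python
import xml.etree.ElementTree as ET

S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def build_select_request(query: str) -> bytes:
    """Build a SelectObjectContent-style XML body for a CSV query.

    Illustrative sketch only; not the flow's own implementation.
    """
    root = ET.Element("SelectObjectContentRequest", xmlns=S3_NS)
    ET.SubElement(root, "Expression").text = query
    ET.SubElement(root, "ExpressionType").text = "SQL"
    # Input: CSV whose first row supplies the column names (assumed setting).
    input_csv = ET.SubElement(ET.SubElement(root, "InputSerialization"), "CSV")
    ET.SubElement(input_csv, "FileHeaderInfo").text = "USE"
    # Output: matching rows returned as CSV.
    ET.SubElement(ET.SubElement(root, "OutputSerialization"), "CSV")
    return ET.tostring(root, encoding="utf-8")


body = build_select_request("SELECT * FROM s3object WHERE status <> 'inactive'")
print(body.decode("utf-8"))
```

Building the body with an XML library rather than string concatenation means operators such as <> are escaped automatically inside the Expression element.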

 

Supported SQL Syntax

  • SELECT - Specify columns (* for all, or comma-separated column names)
  • FROM s3object - Always use "s3object" as the table name
  • WHERE - Filter rows using conditions (=, <>, <, >, <=, >=, AND, OR, NOT)
  • LIMIT - Restrict the number of returned rows
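
The clauses above can be combined programmatically. The following sketch composes a query from a column list, equality conditions, and an optional limit. The build_query and quote_literal helpers are hypothetical, and doubling single quotes inside string values follows standard SQL escaping, which is assumed rather than confirmed to apply here.

```python
def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_query(columns=None, where=None, limit=None) -> str:
    """Compose a SELECT ... FROM s3object query (hypothetical helper).

    columns: list of column names, or None for "*"
    where:   dict of column -> string value, joined with AND
    limit:   maximum number of rows, or None for no LIMIT clause
    """
    select_list = ", ".join(columns) if columns else "*"
    query = f"SELECT {select_list} FROM s3object"
    if where:
        conditions = [f"{col}={quote_literal(val)}" for col, val in where.items()]
        query += " WHERE " + " AND ".join(conditions)
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


query = build_query(["name", "email"], {"status": "active"}, limit=10)
print(query)
```

Column names are inserted as-is, so this sketch should only be used with column names you control.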

 

 
