Skip to content

General Syntax

ValueArray

A ValueArray is a reference to a list of values. This can be written using:

  • '{value1};{value2}...' syntax

  • '{table}::{column}' syntax

  • A BlendSQL query which returns a 1d array of values ((SELECT value FROM table WHERE ...))

Passing options

The functions LLMMap and LLMQA support the passing of an options argument. This will constrain the output of the functions to only values appearing in the passed ValueArray.

SELECT Id FROM posts p
WHERE {{
    LLMMap(
        'What is the sentiment of this post?',
        'p::Body',
        options='positive;negative'
}}

Functions

LLMQA

The LLMQA is an aggregate function that returns a single scalar value.

Quantifier

An optional quantifier argument can be passed, which will be used to modify the regular expression pattern powering the constrained decoding. The following greedy quantifiers are valid:

  • '*', meaning 'zero-or-more'
  • '+', meaning 'one-or-more'
  • Any string matching the pattern {\d(,\d)?}
def LLMQA(
    question: str,
    context: t.Optional[Query] = None,
    options: t.Optional[ValueArray] = None,
    return_type: t.Optional[ReturnType] = None,
    regex: t.Optional[str] = None,
    quantifier: t.Optional[Quantifier] = None
):
    ...

Examples:

SELECT preferred_foot FROM Player p
WHERE p.player_name = {{
    -- With `infer_gen_constraints=True` (which is default),
    --  `options` will automatically be inferred, and the below
    --  will select from a value in the `p.player_name` column.
    LLMQA(
        "Which player has the most Ballon d'Or awards?"
    )
}}

SELECT name FROM state_flowers
WHERE state = {{
    LLMQA(
        "Which state is known as 'The Golden State'?",
        -- Pass context via a subquery
        (SELECT title, content FROM documents)
    )
}}
-- Generate 3 values in our generated tuple
SELECT * FROM VALUES {{LLMQA('What are the first 3 letters of the alphabet?', quantifier='{3}')}}

LLMMap

The LLMMap is a unary scalar function, much like LENGTH or ABS in SQlite. The output of this function is set as a new column in a temporary table, for later use within the wider query.

def LLMMap(
    question: str,
    values: ValueArray,
    options: t.Optional[ValueArray] = None,
    return_type: t.Optional[ReturnType] = None,
    regex: t.Optional[str] = None
):
    ...

Examples:

SELECT COUNT(DISTINCT(s.CDSCode)) FROM schools s
JOIN satscores sa ON s.CDSCode = sa.cds
WHERE sa.AvgScrMath > 560
-- With `infer_gen_constraints=True`, generations below will be restricted to a boolean.
AND {{LLMMap('Is this a county in the California Bay Area?', 's::County')}} = TRUE

SELECT GROUP_CONCAT(Name, ', ') AS Names,
{{
    LLMMap(
        'In which time period was this person born?',
        'People::Name',
        options='Eras::Years'
    )
}} AS Born
FROM People
GROUP BY Born

LLMJoin

The LLMJoin function can be used to perform semantic entity linking between columns in tables. It is commonly used in conjunction with a documents table, to fetch articles related to a value in another table.

def LLMJoin(
    left_on: ValueArray,
    right_on: ValueArray,
    join_criteria: t.Optional[str]
):
    ...

Examples:

-- Get all articles on players older than 21
SELECT * FROM Player p
JOIN {{
    LLMJoin(
        'p::Name',
        'documents::title'
    )
}} WHERE p.age > 21

SELECT fruits.name, colors.name FROM fruits
JOIN {{
    LLMJoin(
        'fruits::name',
        'colors::name',
        -- If we need to, we can pass a join_criteria.
        -- Otherwise, the default 'Join by topic' is used.
        'Align the fruit to its color.'
    )
}}