LLMJoin
Usage
LLMJoin
Bases: JoinIngredient
from_args(model=None, use_skrub_joiner=True, few_shot_examples=None, k=None)
classmethod
Creates a partial class with predefined arguments.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
few_shot_examples | List[dict] | A list of AnnotatedJoinExamples dictionaries for few-shot learning. If not specified, will use default_examples.json as default. | None |
use_skrub_joiner | bool | Whether to use the skrub joiner. Defaults to True. | True |
k | Optional[int] | Determines number of few-shot examples to use for each ingredient call. Default is None, which will use all few-shot examples on all calls. If specified, will initialize a haystack-based DPR retriever to filter examples. | None |
Returns:
Type | Description |
---|---|
Type[JoinIngredient] | A partial class of JoinIngredient with predefined arguments. |
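To make "a partial class with predefined arguments" concrete, here is a minimal sketch of the general pattern using `functools.partial`. It is not BlendSQL's actual implementation: `ToyJoinIngredient`, its `name` field, and the variable `Configured` are hypothetical names chosen for illustration, and the real `from_args` may build its return value differently.

```python
from functools import partial
from typing import List, Optional


class ToyJoinIngredient:
    """Hypothetical stand-in for an ingredient class (not BlendSQL's JoinIngredient)."""

    def __init__(
        self,
        name: str,
        few_shot_examples: Optional[List[dict]] = None,
        k: Optional[int] = None,
    ):
        self.name = name
        self.few_shot_examples = few_shot_examples or []
        self.k = k

    @classmethod
    def from_args(
        cls,
        few_shot_examples: Optional[List[dict]] = None,
        k: Optional[int] = None,
    ):
        # Bind the configuration now; any remaining arguments are supplied
        # later, when the caller instantiates the ingredient.
        return partial(cls, few_shot_examples=few_shot_examples, k=k)


# Configure once...
Configured = ToyJoinIngredient.from_args(few_shot_examples=[{"mapping": {}}], k=1)
# ...and instantiate later with whatever is still missing.
ingredient = Configured(name="LLMJoin")
```

The point of the pattern is that configuration and instantiation happen at different times: the caller fixes the few-shot settings up front, and the framework creates the object when it needs it.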
Examples:
from blendsql import blend, LLMJoin
from blendsql.ingredients.builtin import DEFAULT_JOIN_FEW_SHOT

ingredients = {
    LLMJoin.from_args(
        few_shot_examples=[
            *DEFAULT_JOIN_FEW_SHOT,
            {
                "join_criteria": "Join the state to its capital.",
                "left_values": ["California", "Massachusetts", "North Carolina"],
                "right_values": ["Sacramento", "Boston", "Chicago"],
                "mapping": {
                    "California": "Sacramento",
                    "Massachusetts": "Boston",
                    "North Carolina": "-"
                }
            }
        ],
        # Will fetch `k` most relevant few-shot examples using embedding-based retriever
        k=2
    )
}

# `blendsql` (the query string), `db`, and `model` are assumed to be defined elsewhere
smoothie = blend(
    query=blendsql,
    db=db,
    ingredients=ingredients,
    default_model=model,
)
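The `k` parameter limits each call to the few-shot examples most relevant to the current join. As a rough illustration of top-`k` selection, the sketch below ranks examples by word overlap (Jaccard similarity) between the query and each example's `join_criteria`. This is a deliberately simple stand-in, not the haystack-based DPR retriever the library describes; `tokenize`, `jaccard`, and `select_top_k` are hypothetical helpers.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    # Lowercase and keep word characters only, so "capital." matches "capital".
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    # Size of the intersection over size of the union; 0.0 if both are empty.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_top_k(query: str, examples: List[dict], k: int) -> List[dict]:
    # Rank examples by similarity of their join criteria to the query.
    query_tokens = tokenize(query)
    ranked = sorted(
        examples,
        key=lambda ex: jaccard(query_tokens, tokenize(ex["join_criteria"])),
        reverse=True,
    )
    return ranked[:k]


examples = [
    {"join_criteria": "Join the state to its capital."},
    {"join_criteria": "Join the fruit to its color."},
    {"join_criteria": "Match each country to its capital city."},
]
selected = select_top_k("Align state to capital", examples, k=2)
```

Here the state/capital example ranks first and the fruit/color example is dropped. An embedding-based retriever serves the same purpose but can also match paraphrases that share no words.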
run(model, left_values, right_values, question=None, few_shot_retriever=None, **kwargs)
Description
This ingredient handles the logic of semantic JOIN clauses between tables.
In other words, it creates a custom mapping between a pair of value sets. Behind the scenes, this mapping is then used to create an auxiliary table to use in carrying out an INNER JOIN.
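The mapping-to-auxiliary-table idea can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not BlendSQL's internal code: the table name `join_map` and its columns are hypothetical, the mapping is hard-coded instead of coming from a model, and `"-"` is assumed to mean "no match", as the few-shot example above suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE States (name TEXT);
    CREATE TABLE Capitals (name TEXT);
    INSERT INTO States VALUES ('California'), ('Massachusetts'), ('North Carolina');
    INSERT INTO Capitals VALUES ('Sacramento'), ('Boston'), ('Chicago');
""")

# Pretend this mapping was produced by the model; "-" is assumed to mean "no match".
mapping = {
    "California": "Sacramento",
    "Massachusetts": "Boston",
    "North Carolina": "-",
}

# Materialize the mapping as an auxiliary table, skipping unmatched values.
conn.execute("CREATE TABLE join_map (left_value TEXT, right_value TEXT)")
conn.executemany(
    "INSERT INTO join_map VALUES (?, ?)",
    [(left, right) for left, right in mapping.items() if right != "-"],
)

# The semantic join then reduces to ordinary INNER JOINs through join_map.
rows = conn.execute("""
    SELECT Capitals.name, States.name
    FROM Capitals
    INNER JOIN join_map ON join_map.right_value = Capitals.name
    INNER JOIN States ON States.name = join_map.left_value
    ORDER BY States.name
""").fetchall()
```

Because the join is an INNER JOIN, rows without a mapping ('North Carolina' and 'Chicago' here) drop out of the result.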
For example:
SELECT Capitals.name, States.name FROM Capitals
    JOIN {{
        LLMJoin(
            'Align state to capital',
            left_on='States::name',
            right_on='Capitals::name'
        )
    }}
Why would a database have two separate tables, States and Capitals, with no foreign key to join the two?
BlendSQL was built to interact with tables "in-the-wild", and many (such as those on Wikipedia) do not have these convenient properties of well-designed relational models.
For this reason, we can leverage the internal knowledge of a pre-trained LLM to do the JOIN operation for us.