LLM API

Provider selection, request fan-out, and structured-output querying.


LLMClient

Batch-oriented synchronous client for sampling candidate responses.

LLMClient

LLMClient(
    model_names: Union[List[str], str] = "gpt-5.1",
    temperatures: Union[float, List[float]] = 0.75,
    max_tokens: Union[int, List[int]] = 4096,
    reasoning_efforts: Union[str, List[str]] = "disabled",
    model_sample_probs: Optional[List[float]] = None,
    output_model: Optional[BaseModel] = None,
    verbose: bool = True,
)
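The constructor accepts either a scalar or a per-model list for `temperatures`, `max_tokens`, and `reasoning_efforts`, which suggests scalar values are broadcast across `model_names`. A minimal sketch of that normalization, using a hypothetical helper name (`broadcast_param` is not part of the shinka API):

```python
from typing import List, Union

def broadcast_param(value, num_models: int) -> list:
    """Expand a scalar setting to one entry per model; pass lists through.

    Hypothetical helper illustrating how a single `temperatures` or
    `max_tokens` value could apply to every model in `model_names`.
    """
    if isinstance(value, list):
        if len(value) != num_models:
            raise ValueError("per-model list must match the number of models")
        return value
    return [value] * num_models

# Placeholder model names, for illustration only.
models = ["gpt-5.1", "claude-x"]
temps = broadcast_param(0.75, len(models))        # scalar -> [0.75, 0.75]
maxes = broadcast_param([4096, 8192], len(models))  # list passed through
```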

batch_query

batch_query(
    num_samples: int,
    msg: Union[str, List[str]],
    system_msg: Union[str, List[str]],
    msg_history: Union[List[Dict], List[List[Dict]]] = [],
    llm_kwargs: List[Dict] = [],
) -> List[QueryResult]

Batch query the LLM with the given message and system message.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `msg` | `str` | The message to query the LLM with. | *required* |
| `system_msg` | `str` | The system message to query the LLM with. | *required* |

batch_kwargs_query

batch_kwargs_query(
    num_samples: int,
    msg: Union[str, List[str]],
    system_msg: Union[str, List[str]],
    msg_history: Union[List[Dict], List[List[Dict]]] = [],
    model_sample_probs: Optional[List[float]] = None,
) -> List[QueryResult]

Batch query the LLM with the given message and system message.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `msg` | `str` | The message to query the LLM with. | *required* |
| `system_msg` | `str` | The system message to query the LLM with. | *required* |
| `model_sample_probs` | `Optional[List[float]]` | Sampling probabilities for each model. | `None` |

get_kwargs

get_kwargs(model_sample_probs: Optional[List[float]] = None)

Get model kwargs for sampling.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_sample_probs` | `Optional[List[float]]` | Sampling probabilities for each model. | `None` |
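`get_kwargs` apparently draws one per-model configuration according to `model_sample_probs`, falling back to uniform sampling when it is `None`. A self-contained sketch of that sampling step (illustrative, not the actual shinka implementation; model names are placeholders):

```python
import random

def sample_kwargs(model_configs, model_sample_probs=None, rng=None):
    """Pick one per-model kwargs dict, weighted by sampling probabilities.

    Sketch of the behavior suggested by `get_kwargs`: uniform choice when
    `model_sample_probs` is None, weighted choice otherwise.
    """
    rng = rng or random.Random()
    if model_sample_probs is None:
        return rng.choice(model_configs)
    return rng.choices(model_configs, weights=model_sample_probs, k=1)[0]

configs = [
    {"model_name": "gpt-5.1", "temperature": 0.75, "max_tokens": 4096},
    {"model_name": "claude-x", "temperature": 0.5, "max_tokens": 8192},
]
picked = sample_kwargs(configs, model_sample_probs=[0.9, 0.1],
                       rng=random.Random(0))
```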

AsyncLLMClient

Async counterpart for the same provider abstraction.

AsyncLLMClient

AsyncLLMClient(
    model_names: Union[List[str], str] = "gpt-5.1",
    temperatures: Union[float, List[float]] = 0.75,
    max_tokens: Union[int, List[int]] = 4096,
    reasoning_efforts: Union[str, List[str]] = "disabled",
    model_sample_probs: Optional[List[float]] = None,
    output_model: Optional[BaseModel] = None,
    verbose: bool = True,
)

batch_query async

batch_query(
    num_samples: int,
    msg: Union[str, List[str]],
    system_msg: Union[str, List[str]],
    msg_history: Union[List[Dict], List[List[Dict]]] = [],
    llm_kwargs: List[Dict] = [],
) -> List[QueryResult]

Batch query the LLM with the given message and system message asynchronously.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `msg` | `str` | The message to query the LLM with. | *required* |
| `system_msg` | `str` | The system message to query the LLM with. | *required* |

batch_kwargs_query async

batch_kwargs_query(
    num_samples: int,
    msg: Union[str, List[str]],
    system_msg: Union[str, List[str]],
    msg_history: Union[List[Dict], List[List[Dict]]] = [],
    model_sample_probs: Optional[List[float]] = None,
) -> List[QueryResult]

Batch query the LLM with the given message and system message asynchronously.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `msg` | `str` | The message to query the LLM with. | *required* |
| `system_msg` | `str` | The system message to query the LLM with. | *required* |
| `model_sample_probs` | `Optional[List[float]]` | Sampling probabilities for each model. | `None` |

get_kwargs

get_kwargs(model_sample_probs: Optional[List[float]] = None)

Get model kwargs for sampling.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_sample_probs` | `Optional[List[float]]` | Sampling probabilities for each model. | `None` |

Direct Query Helpers

Lower-level provider dispatch:

query

query(
    model_name: str,
    msg: str,
    system_msg: str,
    msg_history: List = [],
    output_model: Optional[BaseModel] = None,
    model_posteriors: Optional[Dict[str, float]] = None,
    **kwargs
) -> QueryResult

Query the LLM.
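Since `query` takes a bare `model_name`, a common pattern for this kind of helper is prefix-based provider routing. The sketch below illustrates that pattern only; the routing table and helper name are assumptions, not shinka's actual dispatch logic:

```python
def route_provider(model_name: str) -> str:
    """Map a model name to a provider label by prefix.

    Illustrative routing table; the real dispatch in shinka may differ.
    """
    prefixes = {
        "gpt-": "openai",
        "claude-": "anthropic",
        "gemini-": "google",
    }
    for prefix, provider in prefixes.items():
        if model_name.startswith(prefix):
            return provider
    raise ValueError(f"no provider registered for {model_name!r}")
```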


query_async async

query_async(
    model_name: str,
    msg: str,
    system_msg: str,
    msg_history: List = [],
    output_model: Optional[BaseModel] = None,
    model_posteriors: Optional[Dict[str, float]] = None,
    **kwargs
) -> QueryResult

Query the LLM asynchronously.


Model Prioritization

Bandit-style model prioritization strategies live in shinka.llm.prioritization. They dynamically shift sampling probability across models based on observed utility and cost.
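As a rough illustration of the idea, a bandit-style prioritizer can track a running utility estimate per model and renormalize sampling probabilities with a softmax. The update rule below is an assumption for illustration, not the actual algorithm in shinka.llm.prioritization:

```python
import math

class ModelPrioritizer:
    """Softmax bandit over per-model utility estimates (illustrative only)."""

    def __init__(self, model_names, temperature=1.0):
        self.utilities = {name: 0.0 for name in model_names}
        self.counts = {name: 0 for name in model_names}
        self.temperature = temperature

    def update(self, model_name: str, reward: float) -> None:
        # Incremental mean of observed reward (e.g. utility minus cost).
        self.counts[model_name] += 1
        n = self.counts[model_name]
        self.utilities[model_name] += (reward - self.utilities[model_name]) / n

    def sample_probs(self) -> dict:
        # Softmax over utilities: higher utility draws more sampling mass.
        scores = [u / self.temperature for u in self.utilities.values()]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return {name: e / total for name, e in zip(self.utilities, exps)}
```

The resulting probabilities could then be fed back into `model_sample_probs` on subsequent batch queries.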