LLM API¶
Provider selection, request fan-out, and structured-output querying.
LLMClient¶
Batch-oriented synchronous client for sampling candidate responses.
LLMClient¶
LLMClient(
model_names: Union[List[str], str] = "gpt-5.1",
temperatures: Union[float, List[float]] = 0.75,
max_tokens: Union[int, List[int]] = 4096,
reasoning_efforts: Union[str, List[str]] = "disabled",
model_sample_probs: Optional[List[float]] = None,
output_model: Optional[BaseModel] = None,
verbose: bool = True,
)
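Each per-model setting (temperatures, max_tokens, reasoning_efforts) accepts either a scalar applied to every model or a list aligned with model_names. A minimal sketch of that broadcasting rule, inferred from the signature above (an illustration, not the library's actual code):

```python
from typing import Union

def per_model(setting: Union[float, int, str, list], n_models: int) -> list:
    """Broadcast a scalar setting to every model, or validate an aligned list."""
    if isinstance(setting, list):
        if len(setting) != n_models:
            raise ValueError(f"expected {n_models} values, got {len(setting)}")
        return setting
    return [setting] * n_models

model_names = ["gpt-5.1", "o4-mini"]  # hypothetical model pool
temperatures = per_model(0.75, len(model_names))        # -> [0.75, 0.75]
max_tokens = per_model([4096, 2048], len(model_names))  # already aligned
```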
batch_query¶
batch_query(
num_samples: int,
msg: Union[str, List[str]],
system_msg: Union[str, List[str]],
msg_history: Union[List[Dict], List[List[Dict]]] = [],
llm_kwargs: List[Dict] = [],
) -> List[QueryResult]
Batch query the LLM with the given message and system message.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| msg | Union[str, List[str]] | The message to query the LLM with. | required |
| system_msg | Union[str, List[str]] | The system message to query the LLM with. | required |
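msg, system_msg, and msg_history accept either a single value fanned out across all num_samples candidates or a per-sample list. A plausible reading of those broadcast semantics (illustrative only, not shinka's implementation):

```python
from typing import List, Union

def fan_out(prompt: Union[str, List[str]], num_samples: int) -> List[str]:
    """One prompt per candidate: repeat a scalar, or use the per-sample list."""
    prompts = [prompt] * num_samples if isinstance(prompt, str) else list(prompt)
    if len(prompts) != num_samples:
        raise ValueError(f"expected {num_samples} prompts, got {len(prompts)}")
    return prompts

msgs = fan_out("Propose one optimization.", 3)
sys_msgs = fan_out(["Be terse.", "Be verbose.", "Be formal."], 3)
```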
batch_kwargs_query¶
batch_kwargs_query(
num_samples: int,
msg: Union[str, List[str]],
system_msg: Union[str, List[str]],
msg_history: Union[List[Dict], List[List[Dict]]] = [],
model_sample_probs: Optional[List[float]] = None,
) -> List[QueryResult]
Batch query the LLM with the given message and system message.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| msg | Union[str, List[str]] | The message to query the LLM with. | required |
| system_msg | Union[str, List[str]] | The system message to query the LLM with. | required |
| model_sample_probs | Optional[List[float]] | Sampling probabilities for each model. | None |
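model_sample_probs weights which model each of the num_samples draws comes from; a uniform draw over model_names is the natural reading of the None default. A self-contained sketch of that weighted sampling (not shinka's actual implementation):

```python
import random

model_names = ["gpt-5.1", "o4-mini"]  # hypothetical pool
model_sample_probs = [0.8, 0.2]       # favor the first model 4:1

rng = random.Random(0)  # seeded for reproducibility
draws = rng.choices(model_names, weights=model_sample_probs, k=1000)
share = draws.count("gpt-5.1") / len(draws)  # close to 0.8
```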
get_kwargs¶
Get model kwargs for sampling.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_sample_probs | Optional[List[float]] | Sampling probabilities for each model. | None |
AsyncLLMClient¶
Async counterpart for the same provider abstraction.
AsyncLLMClient¶
AsyncLLMClient(
model_names: Union[List[str], str] = "gpt-5.1",
temperatures: Union[float, List[float]] = 0.75,
max_tokens: Union[int, List[int]] = 4096,
reasoning_efforts: Union[str, List[str]] = "disabled",
model_sample_probs: Optional[List[float]] = None,
output_model: Optional[BaseModel] = None,
verbose: bool = True,
)
batch_query async¶
batch_query(
num_samples: int,
msg: Union[str, List[str]],
system_msg: Union[str, List[str]],
msg_history: Union[List[Dict], List[List[Dict]]] = [],
llm_kwargs: List[Dict] = [],
) -> List[QueryResult]
Batch query the LLM with the given message and system message asynchronously.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| msg | Union[str, List[str]] | The message to query the LLM with. | required |
| system_msg | Union[str, List[str]] | The system message to query the LLM with. | required |
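The async client has the same surface but awaits all num_samples provider calls concurrently. A stubbed sketch of that fan-out with asyncio.gather, where fake_query stands in for a real provider call:

```python
import asyncio
from typing import List

async def fake_query(model: str, msg: str, system_msg: str) -> str:
    # Stand-in for one awaited provider call.
    await asyncio.sleep(0)
    return f"{model}: {msg}"

async def batch_query(num_samples: int, msg: str, system_msg: str) -> List[str]:
    # Launch all samples at once and gather their results concurrently.
    calls = (fake_query("gpt-5.1", msg, system_msg) for _ in range(num_samples))
    return await asyncio.gather(*calls)

results = asyncio.run(batch_query(3, "Propose one optimization.", "Be terse."))
```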
batch_kwargs_query async¶
batch_kwargs_query(
num_samples: int,
msg: Union[str, List[str]],
system_msg: Union[str, List[str]],
msg_history: Union[List[Dict], List[List[Dict]]] = [],
model_sample_probs: Optional[List[float]] = None,
) -> List[QueryResult]
Batch query the LLM with the given message and system message asynchronously.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| msg | Union[str, List[str]] | The message to query the LLM with. | required |
| system_msg | Union[str, List[str]] | The system message to query the LLM with. | required |
| model_sample_probs | Optional[List[float]] | Sampling probabilities for each model. | None |
get_kwargs¶
Get model kwargs for sampling.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_sample_probs | Optional[List[float]] | Sampling probabilities for each model. | None |
Direct Query Helpers¶
Lower-level provider dispatch:
query¶
query(
model_name: str,
msg: str,
system_msg: str,
msg_history: List = [],
output_model: Optional[BaseModel] = None,
model_posteriors: Optional[Dict[str, float]] = None,
**kwargs
) -> QueryResult
Query the LLM.
query_async async¶
query_async(
model_name: str,
msg: str,
system_msg: str,
msg_history: List = [],
output_model: Optional[BaseModel] = None,
model_posteriors: Optional[Dict[str, float]] = None,
**kwargs
) -> QueryResult
Query the LLM asynchronously.
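query routes a single request to the appropriate provider based on model_name and returns a QueryResult. The dispatch pattern might look like the following sketch; the QueryResult fields and the prefix table here are hypothetical stand-ins, not shinka's actual definitions:

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class QueryResult:
    # Minimal stand-in; the real QueryResult likely carries more metadata.
    model_name: str
    content: str

# Hypothetical model-name-prefix -> provider-call table.
PROVIDERS: Dict[str, Callable[[str, str], str]] = {
    "gpt": lambda msg, system_msg: f"[openai] {msg}",
    "claude": lambda msg, system_msg: f"[anthropic] {msg}",
}

def query(model_name: str, msg: str, system_msg: str = "") -> QueryResult:
    for prefix, call in PROVIDERS.items():
        if model_name.startswith(prefix):
            return QueryResult(model_name, call(msg, system_msg))
    raise ValueError(f"no provider for model {model_name!r}")
```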
Model Prioritization¶
Bandit-style model prioritization strategies via shinka.llm.prioritization.
These strategies dynamically shift sampling probability across models based on observed
utility and cost.
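As one concrete (hypothetical) instance of such a strategy, each model can be scored by observed utility per unit cost and the scores passed through a softmax to get sampling probabilities; the actual shinka.llm.prioritization strategies may differ:

```python
import math
from typing import Dict

def prioritize(utility: Dict[str, float], cost: Dict[str, float],
               temperature: float = 1.0) -> Dict[str, float]:
    """Softmax over utility-per-cost scores -> per-model sampling probabilities."""
    scores = {m: utility[m] / cost[m] for m in utility}
    top = max(scores.values())  # subtract max for numerical stability
    exps = {m: math.exp((s - top) / temperature) for m, s in scores.items()}
    total = sum(exps.values())
    return {m: e / total for m, e in exps.items()}

# A cheap model with decent utility outranks an expensive stronger one.
probs = prioritize({"gpt-5.1": 0.9, "o4-mini": 0.6},
                   {"gpt-5.1": 3.0, "o4-mini": 1.0})
```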