Playground
The AkashML Playground is a chat interface for interacting with open source AI models in real time. Configure model parameters, compare outputs, and test prompts before integrating them into your application.
Starting a conversation
Use the model selector in the sidebar to browse available models. Each model card shows:
- Provider and family (e.g. Meta Llama, DeepSeek)
- Parameter count and context length
- Pricing per million input and output tokens
Select a model, adjust parameters (temperature, max tokens, system prompt), and type a message. The Playground streams responses token by token using the same API your application would call.
You can switch models mid-session without clearing history, useful for comparing how different models handle the same prompt.
Parameter controls
Common parameters
- Temperature — Controls randomness. Lower values (0.1–0.3) produce focused, deterministic outputs. Higher values (0.8–1.5) increase creativity and diversity.
- Max Tokens — Caps the length of the generated response. The maximum depends on the model's context window.
- Top P — Nucleus sampling. The model considers tokens whose cumulative probability reaches this threshold. An alternative to temperature for controlling diversity.
- Top K — Limits token selection to the top K most probable tokens at each step. Some models hide this parameter.
- Frequency Penalty — Reduces repetition of tokens proportional to how often they have already appeared.
- Presence Penalty — Reduces repetition of any token that has appeared at all, regardless of frequency.
- Repetition Penalty — A general multiplier that penalizes repeated tokens. Available on some models as an alternative to frequency/presence penalty.
Some models hide certain parameters that don't apply to their architecture. The Playground automatically shows only the relevant controls.
Reasoning mode
Models that support reasoning (indicated by supportsForceReasoning in their settings) can be switched into thinking mode. When enabled, the model produces a reasoning_content trace alongside the regular response, showing its step-by-step thought process.
Toggle reasoning mode in the sidebar when available. The reasoning trace appears in the chat alongside the final answer.
System prompts
Set a system prompt to define the model's behavior for the entire conversation. The system prompt is sent as the first message with role: "system". Common uses:
- Define a persona — "You are a senior Python developer who writes concise, well-tested code."
- Set output format — "Always respond in valid JSON with keys: answer, confidence, sources."
- Add constraints — "Keep responses under 100 words. Do not use markdown."
Some models have a suggested system prompt placeholder. You can see it as grey text in the system prompt field.
Presets
A preset bundles a model, system prompt, and parameter set into a named configuration. Create presets from the Playground sidebar to quickly load a proven setup.
Click Save Preset to store the current configuration:
- Model selection
- All parameter values (temperature, max tokens, top P, top K, penalties)
- System prompt
Presets are saved to your account and persist across sessions. Load a preset to instantly restore a full configuration.
Inferences
Every inference you run is logged and accessible from Playground → Inferences. Each entry shows the model used, token counts, and the total costs.
Use the inferences log to:
- Review token usage per request
- Monitor and compare costs
Conversation management
- Each conversation maintains a full message history that is sent with every request, so the model has context of the entire chat.
- Clear the conversation to start fresh — this resets the message history but keeps your parameter settings.
- Messages include the
role(userorassistant) andcontent. When reasoning mode is enabled, assistant messages may also includereasoning_content.
Keyboard shortcuts
| Shortcut | Action |
|---|---|
Enter | Send message |
Shift + Enter | New line in message |
Tips for better results
- Be specific — Instead of "tell me about AI", try "explain the difference between supervised and unsupervised learning in 3 bullet points".
- Iterate on system prompts — Small wording changes can significantly alter output quality and format.
- Use lower temperature for factual tasks — Set temperature to 0.1–0.3 for code generation, data extraction, or factual Q&A.
- Use reasoning mode for complex problems — Enable thinking on supported models for math, logic, and multi-step reasoning tasks.
- Compare models — Try the same prompt across different models to find the best fit for your use case. Featured models are a good starting point.
- Watch your credits — The Dashboard shows real-time token consumption and cost breakdowns per model. Use the usage chart to track spending trends.