Ollama SDK Hands-On Workshop
This workshop provides a comprehensive, hands-on introduction to the @tekimax/ollama-sdk. By the end of this workshop, you’ll have practical experience using both the command-line interface and the programmatic API for working with large language models through Ollama.
Table of Contents
- Prerequisites
- Part 1: CLI Basics
- Part 2: API Fundamentals
- Part 3: Advanced Topics
- Part 4: Real-World Projects
Prerequisites
Before starting this workshop, ensure you have:
- Node.js (v16 or later) installed
- Ollama installed and running locally
- Basic familiarity with JavaScript/TypeScript and command-line operations
- A code editor (VS Code recommended)
- At least 8GB RAM (16GB+ recommended for larger models)
- Terminal or command prompt access
Part 1: CLI Basics
This section focuses on using the command-line interface to interact with Ollama models.
Setting Up
- Install the Ollama SDK globally:
- Verify installation:
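Both steps together, assuming the package publishes an ollama-sdk binary (the exact binary name is an assumption — check the package README):

```shell
# Install the CLI globally
npm install -g @tekimax/ollama-sdk

# Verify installation
ollama-sdk --help
```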
You should see the help output with available commands and options.
Exploring Available Models
Let’s start by exploring what models are available in your Ollama installation:
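Assuming the CLI binary is named ollama-sdk, the command looks like:

```shell
ollama-sdk list
```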
This command displays all models currently downloaded to your system, along with their sizes.
Workshop Activity #1:
- Run the list command
- Note which models are available
- If you don’t have any models yet, the list will be empty
Basic Text Generation
Generate your first text using a model:
If you don’t have the llama2 model, replace it with any model from your list, or pull llama2 first:
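A sketch of both commands, using the flag names from the parameter explanation below (verify against --help, since the exact subcommand and flag spellings are assumptions):

```shell
# Generate text with llama2
ollama-sdk generate -m llama2 -p "Explain quantum computing in one paragraph"

# Pull llama2 first if it is missing from your list
ollama-sdk pull -m llama2
```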
Workshop Activity #2:
- Generate text with a simple prompt
- Try different prompts and observe the responses
- Note the response time and quality
Working with Parameters
The CLI supports several parameters to control text generation:
Parameter Explanation:
- -m, --model: The model to use (e.g., llama2, mistral)
- -p, --prompt: The input text prompt
- -s, --stream: Stream the response token-by-token instead of waiting for the complete response
- -t, --temperature: Controls randomness (0.0 = deterministic, 1.0 = maximum creativity)
- --top-p: Controls diversity through nucleus sampling (0.0-1.0)
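For example, combining several of these parameters in one call (flag spellings as listed above — verify with --help):

```shell
ollama-sdk generate -m llama2 -p "Write a haiku about autumn" -t 0.8 --top-p 0.9 -s
```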
Workshop Activity #3:
- Compare output with different temperature settings:
- Low temperature (0.2): More focused, deterministic responses
- Medium temperature (0.5): Balanced randomness
- High temperature (0.8): More creative, varied responses
- Document the differences in outputs
Creating Embeddings
Embeddings are vector representations of text that capture semantic meaning:
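A sketch of the CLI call, assuming an embed subcommand (the subcommand name is an assumption — check --help):

```shell
ollama-sdk embed -m llama2 -p "The quick brown fox jumps over the lazy dog"
```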
The output is a high-dimensional vector; typically only the first few values are displayed.
Workshop Activity #4:
- Create embeddings for several related sentences
- Create embeddings for unrelated sentences
- Note the dimension of the embeddings (will vary based on model)
Managing Models
You can download (pull), view details, and remove models:
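The three operations side by side; pull matches the earlier CLI usage, while the show and rm subcommand names are assumptions to verify against --help:

```shell
ollama-sdk pull -m orca-mini   # download a model
ollama-sdk show -m orca-mini   # view details such as license and size
ollama-sdk rm -m orca-mini     # remove a model
```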
Workshop Activity #5:
- Pull a small model like orca-mini (if not already present)
- View its details with the show command
- Compare the license and size information with another model
CLI Exercises
Exercise 1: Text Comparison
- Generate descriptions of the same topic with three different temperature settings
- Compare how the outputs differ in style and content
Exercise 2: Model Comparison
- Run the same prompt through 2-3 different models
- Compare response quality, speed, and style
Exercise 3: Parameter Experimentation
- Create a simple prompt
- Generate responses with different combinations of temperature and top-p
- Document which settings give the best results for your use case
Part 2: API Fundamentals
This section covers using the Ollama SDK programmatically in JavaScript/TypeScript applications.
SDK Installation
Create a new project directory and initialize:
Create a basic test file:
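A typical setup sequence:

```shell
mkdir ollama-workshop && cd ollama-workshop
npm init -y
npm install @tekimax/ollama-sdk
touch test.js
```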
Basic Client Setup
Open test.js in your editor and add the following code:
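A minimal sketch for test.js. The class name OllamaKit, the constructor options, and models.list() are assumptions about this SDK's surface — check the package documentation for the real names before running.

```javascript
// test.js — list the models available in the local Ollama instance
async function main() {
  // Hypothetical import: the export name is an assumption
  const { OllamaKit } = await import('@tekimax/ollama-sdk');
  const client = new OllamaKit({ host: 'http://localhost:11434' });

  const models = await client.models.list();
  for (const model of models) {
    console.log(`${model.name} (${model.size ?? 'unknown size'})`);
  }
}

main().catch((err) => console.error('Could not reach Ollama:', err.message));
```

The final catch keeps the script from crashing with an unhandled rejection if Ollama is not running.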
Run the script with node test.js.
Workshop Activity #6:
- Run the script to list available models
- Compare the output with the CLI list command result
Text Generation API
Create a new file for text generation (e.g., generate.js):
Add code to generate text:
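A sketch of generate.js. The generate(), system, and temperature names mirror the CLI parameters from Part 1, but the exact SDK property names are assumptions.

```javascript
// generate.js — basic text generation with a system prompt and temperature
async function main() {
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption
  const client = new OllamaKit({ host: 'http://localhost:11434' });

  const response = await client.generate({
    model: 'llama2',
    prompt: 'Explain the difference between HTTP and HTTPS.',
    system: 'You are a concise technical writer.', // optional system prompt
    temperature: 0.5,
  });

  console.log(response.text ?? response);
}

main().catch((err) => console.error('Generation failed:', err.message));
```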
Run the script with node generate.js.
Parameter Reference (the names mirror the CLI flags; the exact SDK property spellings are assumptions to check against the package docs):
- model: the model to use
- prompt: the input text
- system: optional system prompt to guide model behavior
- temperature: randomness (0.0 = deterministic, 1.0 = maximum creativity)
- topP: nucleus sampling threshold (0.0-1.0)
- stop: sequences that end generation early
- stream: stream tokens instead of returning the full response
Workshop Activity #7:
- Modify the generate.js script to use different parameters
- Experiment with the system parameter to guide model behavior
- Try using stop sequences to end generation at specific points
Streaming Responses
For longer responses or better user experience, streaming is recommended:
Add the following code to a new file (e.g., stream.js):
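A sketch of stream.js. Whether the SDK returns an async iterator when stream: true is set is an assumption; adapt it to the streaming interface the package actually documents.

```javascript
// stream.js — print tokens as they arrive instead of waiting for the full response
async function main() {
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption
  const client = new OllamaKit({ host: 'http://localhost:11434' });

  const stream = await client.generate({
    model: 'llama2',
    prompt: 'Tell a short story about a robot learning to paint.',
    stream: true,
  });

  for await (const chunk of stream) {
    process.stdout.write(chunk.text ?? ''); // each chunk carries a token or group of tokens
  }
  process.stdout.write('\n');
}

main().catch((err) => console.error('Streaming failed:', err.message));
```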
Run the script with node stream.js.
Workshop Activity #8:
- Run the streaming example
- Modify it to include a progress indicator (e.g., token count)
- Implement a way to save the streamed output to a file
Working with Embeddings
Create a file for working with embeddings (e.g., embeddings.js):
Add code to create and compare embeddings:
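A sketch of embeddings.js. The embed() call shape is an assumption about the SDK; cosineSimilarity is plain math and works on any pair of equal-length vectors.

```javascript
// embeddings.js — create embeddings and compare them with cosine similarity
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function main() {
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption
  const client = new OllamaKit({ host: 'http://localhost:11434' });

  const texts = [
    'The cat sat on the mat.',
    'A feline rested on the rug.',
    'Stock prices fell sharply today.',
  ];
  const vectors = [];
  for (const text of texts) {
    const { embedding } = await client.embed({ model: 'llama2', prompt: text });
    vectors.push(embedding);
  }

  // Related sentences should score closer to 1 than unrelated ones
  console.log('similar  :', cosineSimilarity(vectors[0], vectors[1]).toFixed(3));
  console.log('unrelated:', cosineSimilarity(vectors[0], vectors[2]).toFixed(3));
}

main().catch((err) => console.error('Embedding failed:', err.message));
```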
Run the script with node embeddings.js.
Workshop Activity #9:
- Add more example texts to the embeddings script
- Observe which texts have higher similarity scores
- Create a simple function to find the most similar text to a query
API Exercises
Exercise 1: Create a Simple Q&A System
- Create a script that:
- Takes a user question as input
- Generates a response using the Ollama API
- Formats and displays the answer
Exercise 2: Compare Embedding Models
- Create embeddings for the same sentences using different models
- Compare the dimensionality and similarity results
Exercise 3: Build a Simple Chat Application
- Create a script that maintains conversation history
- Send the conversation history with each new message
- Implement a simple CLI chat interface
Part 3: Advanced Topics
This section covers more advanced usage of the Ollama SDK.
Tool Calling with Models
Many advanced LLMs now support tool or function calling capabilities, which allow the model to request the execution of external functions to access data or perform actions. The Ollama SDK provides support for this functionality.
Create a new file (e.g., tools.js):
Add the following code:
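A sketch of tools.js. The tool schema follows the widely used OpenAI-style function format; the chat() method and the toolCalls response field are assumptions about this SDK.

```javascript
// tools.js — ask the model a question it can only answer by requesting a tool
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string', description: 'City name' } },
        required: ['city'],
      },
    },
  },
];

async function main() {
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption
  const client = new OllamaKit({ host: 'http://localhost:11434' });

  const response = await client.chat({
    model: 'llama2',
    messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
    tools,
  });

  // If the model decided to call a tool, its name and arguments appear here
  for (const call of response.toolCalls ?? []) {
    console.log(`Model requested: ${call.function.name}`, call.function.arguments);
  }
}

main().catch((err) => console.error('Tool calling failed:', err.message));
```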
Run the script with node tools.js.
Workshop Activity #10:
- Expand the tools with additional functions (e.g., search, calculator, etc.)
- Create a streaming version of the tool calling example
- Implement actual tool execution (e.g., using a weather API or other services)
Tool Calling with the CLI
You can also use tool calling via the CLI:
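A sketch of a CLI invocation; the --tools flag and the JSON file convention are assumptions to verify against ollama-sdk --help:

```shell
# tools.json holds OpenAI-style function definitions
ollama-sdk generate -m llama2 -p "What is the weather in Paris?" --tools tools.json
```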
Workshop Activity #11:
- Create a more complex tools JSON file with multiple functions
- Experiment with different prompts to see when tools get called
- Compare how different models use tool calling capability
OpenAI Compatibility Layer
The Ollama SDK includes an OpenAI compatibility layer that allows you to use Ollama models with code designed for the OpenAI API.
Create a new file (e.g., openai-compat.js):
Add the following code:
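A sketch of openai-compat.js. The chat.completions.create call mirrors the OpenAI SDK's shape; whether this SDK exposes it under an openai property exactly like this is an assumption.

```javascript
// openai-compat.js — OpenAI-style request served by a local Ollama model
async function main() {
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption
  const client = new OllamaKit({ host: 'http://localhost:11434' });

  const completion = await client.openai.chat.completions.create({
    model: 'llama2',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Summarize what an embedding is.' },
    ],
  });

  // The response follows the OpenAI choices/message structure
  console.log(completion.choices[0].message.content);
}

main().catch((err) => console.error('Request failed:', err.message));
```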
Run the script with node openai-compat.js.
Workshop Activity #12:
- Run the OpenAI compatibility example
- Compare the response format with regular Ollama API responses
- If you have existing OpenAI code, try adapting it to use the compatibility layer
Error Handling and Retries
Create a file to demonstrate robust error handling (e.g., robust.js):
Add the following code:
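A sketch of robust.js. Only withRetry and the fallback loop are plain JavaScript; the generate() call shape is an assumption about the SDK, and the fallback model list should match models you actually have installed.

```javascript
// robust.js — retry transient failures, then fall back to alternative models
async function withRetry(fn, retries = 3, delayMs = 500) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err;
      console.warn(`Attempt ${attempt} failed (${err.message}), retrying...`);
      await new Promise((r) => setTimeout(r, delayMs * attempt)); // linear backoff
    }
  }
}

const FALLBACK_MODELS = ['llama2', 'mistral', 'orca-mini']; // adjust to your installed models

async function generateWithFallback(client, prompt) {
  for (const model of FALLBACK_MODELS) {
    try {
      return await withRetry(() => client.generate({ model, prompt }));
    } catch (err) {
      console.warn(`Model ${model} failed: ${err.message}`);
    }
  }
  throw new Error('All models failed');
}

async function main() {
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption
  const client = new OllamaKit({ host: 'http://localhost:11434' });
  const response = await generateWithFallback(client, 'Define idempotency.');
  console.log(response.text ?? response);
}

main().catch((err) => console.error(err.message));
```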
Run the script with node robust.js.
Workshop Activity #13:
- Run the error handling example
- Modify the fallback models list to match models you have installed
- Add additional error handling for different types of errors
Performance Optimization
Create a file to demonstrate performance optimizations (e.g., perf.js):
Add the following code:
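A sketch of perf.js comparing sequential and parallel request patterns. The timing helper is plain JavaScript; the generate() call shape is an assumption about the SDK.

```javascript
// perf.js — measure sequential vs. parallel request latency
async function timed(label, fn) {
  const start = Date.now();
  const result = await fn();
  console.log(`${label}: ${Date.now() - start} ms`);
  return result;
}

async function main() {
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption
  const client = new OllamaKit({ host: 'http://localhost:11434' });

  const prompts = ['Define REST.', 'Define gRPC.', 'Define GraphQL.'];
  const ask = (prompt) => client.generate({ model: 'llama2', prompt });

  // Sequential: one request at a time
  await timed('sequential', async () => {
    const out = [];
    for (const p of prompts) out.push(await ask(p));
    return out;
  });

  // Parallel: all requests in flight at once — usually faster, but heavier on the server
  await timed('parallel', () => Promise.all(prompts.map(ask)));
}

main().catch((err) => console.error('Benchmark failed:', err.message));
```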
Run the script with node perf.js.
Workshop Activity #14:
- Run the performance optimization examples
- Measure the execution time of parallel vs. sequential requests
- Experiment with different buffer sizes in the streaming example
Integrating with Web Applications
Set up a basic Express.js server that uses the Ollama SDK:
Add the following code to server.js:
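A sketch of server.js; it requires npm install express in the project first. The route paths are illustrative choices, and the SDK method names remain assumptions.

```javascript
// server.js — minimal Express server wrapping generation and embedding calls
async function main() {
  const { default: express } = await import('express');
  const { OllamaKit } = await import('@tekimax/ollama-sdk'); // export name is an assumption

  const app = express();
  app.use(express.json());
  app.use(express.static('public')); // serves the frontend from public/

  const client = new OllamaKit({ host: 'http://localhost:11434' });

  app.post('/api/generate', async (req, res) => {
    try {
      const { model = 'llama2', prompt, temperature = 0.7 } = req.body;
      const response = await client.generate({ model, prompt, temperature });
      res.json({ text: response.text ?? response });
    } catch (err) {
      res.status(500).json({ error: err.message });
    }
  });

  app.post('/api/embed', async (req, res) => {
    try {
      const { model = 'llama2', prompt } = req.body;
      const { embedding } = await client.embed({ model, prompt });
      res.json({ dimensions: embedding.length, preview: embedding.slice(0, 10) });
    } catch (err) {
      res.status(500).json({ error: err.message });
    }
  });

  app.listen(3000, () => console.log('Listening on http://localhost:3000'));
}

main().catch((err) => console.error('Server failed to start:', err.message));
```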
Create a basic frontend:
Add the following HTML:
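A minimal page, assuming the server exposes a POST /api/generate route that accepts { prompt } and returns { text } (adjust the path and fields to whatever your server actually implements). Save it as public/index.html so the static middleware can serve it:

```html
<!DOCTYPE html>
<html>
<head><title>Ollama Workshop</title></head>
<body>
  <h1>Ollama Playground</h1>
  <textarea id="prompt" rows="4" cols="60">Explain embeddings briefly.</textarea><br>
  <button onclick="generate()">Generate</button>
  <pre id="output"></pre>
  <script>
    async function generate() {
      const prompt = document.getElementById('prompt').value;
      const res = await fetch('/api/generate', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt }),
      });
      const data = await res.json();
      document.getElementById('output').textContent = data.text ?? data.error;
    }
  </script>
</body>
</html>
```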
Run the server with node server.js.
Access the web interface at http://localhost:3000
Workshop Activity #15:
- Run the web server and access the interface
- Generate text responses using different models and temperatures
- Create embeddings and observe the vectors
- Modify the server to add a streaming endpoint
Advanced Exercises
Exercise 1: Create a Chat Interface with History
- Extend the web application to maintain conversation history
- Display a chat-like interface with user and assistant messages
- Implement streaming for a better user experience
Exercise 2: Build a Model Management Dashboard
- Add functionality to pull and remove models from the interface
- Display detailed model information
- Show model status and download progress
Exercise 3: Implement a Vector Database
- Store embeddings in a simple database (e.g., JSON file)
- Add a search function to find similar entries
- Create a simple knowledge base backed by embeddings
Part 4: Real-World Projects
This section contains larger project ideas to demonstrate practical applications of the Ollama SDK.
Building a Chat Application
Create a full-featured chat application:
- Multiple conversation management
- Chat history persistence
- System prompt customization
- Model switching
- Streaming responses
Creating a Document Q&A System
Build a system that can answer questions based on provided documents:
- Process documents to create chunks
- Generate embeddings for each chunk
- Store embeddings with text in a simple database
- For each query, find relevant chunks using embedding similarity
- Use relevant chunks as context for the model to generate answers
Semantic Search Implementation
Implement a semantic search system:
- Create a corpus of text documents
- Generate embeddings for all documents or paragraphs
- Create a search interface that converts queries to embeddings
- Return and highlight the most similar results
Final Project
Choose one of the project ideas above or create your own. The final project should demonstrate:
- Proper SDK usage
- Efficient handling of responses
- Good user experience
- Error handling
- Performance considerations
Conclusion
This workshop has covered the fundamentals of using the @tekimax/ollama-sdk from basic CLI usage to advanced API integrations. You now have the skills to:
- Use the CLI for quick model interactions
- Incorporate the SDK into JavaScript/TypeScript applications
- Work with text generation and embeddings
- Build real-world applications using LLMs
Remember that large language models are a rapidly evolving field. Stay updated with the latest SDK features and model capabilities to make the most of your projects.