Darin Gordon Dowwie

Complete Ollama Guide

Running GGUF Models Locally with Ollama

GGUF (GPT-Generated Unified Format) has quickly become the go-to standard for running large language models on your machine. There’s a growing number of GGUF models on Hugging Face, and thanks to community contributors like TheBloke, you now have easy access to them.

Ollama is an application based on llama.cpp that lets you interact with large language models directly on your computer. With Ollama, you can run any GGUF-quantized model available on Hugging Face directly, without creating a new Modelfile or downloading the model manually.
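As a rough illustration of running a Hugging Face model directly, the sketch below builds the model reference that `ollama run` accepts for GGUF repositories, in the form `hf.co/{username}/{repository}` with an optional quantization tag. The helper names and the `example-user/example-model-GGUF` repository are placeholders made up for this example; nothing is executed against Ollama.

```python
def hf_model_ref(user, repo, quant=None):
    """Build an Ollama model reference for a GGUF repo hosted on Hugging Face."""
    ref = f"hf.co/{user}/{repo}"
    # An optional tag picks one quantization from the repo, e.g. Q4_K_M.
    if quant:
        ref += f":{quant}"
    return ref


def ollama_run_command(user, repo, quant=None):
    """Return the argv list for `ollama run`; this only builds the command."""
    return ["ollama", "run", hf_model_ref(user, repo, quant)]
```

Calling `ollama_run_command("example-user", "example-model-GGUF", "Q4_K_M")` gives an argument list that could be passed to `subprocess.run` on a machine where Ollama is installed.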

In this guide, we'll explore two methods to run GGUF models locally with Ollama:

Remove USB Guard From Ubuntu

If you're a sucker like me and installed usbguard on an Ubuntu variant, you may find that you have access to none of your USB devices at all, because F you. The installer automatically sets up the daemon, which has no rules and so blocks every device. A basic apt remove usbguard may fail at 25%, because also F you.

My kernel is version 4.15.0-47-generic, not sure if this stopped working at some point or what.

Regain Access

echo "allow id *:*" | sudo tee /etc/usbguard/rules.conf > /dev/null

sudo sed -i 's/PresentDevicePolicy=apply-policy/PresentDevicePolicy=allow/' /etc/usbguard/usbguard-daemon.conf

If you get the following error when compiling your Elixir project, the cause is library maintainers not pinning their dependency versions, so a breaking change upstream has hit many downstream projects.

could not compile dependency :ssl_verify_fun, "mix compile" failed

To fix, simply run from command line:

Find out ip address using CLI

wget -qO- icanhazip.com
wget -qO- ipecho.net/plain; echo

curl checkip.amazonaws.com     # ipv4
curl ifconfig.co               # ipv6
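The same lookup can be sketched in Python. Which address family a service returns depends on the host's connectivity, so this version validates the plain-text response and reports the IP version rather than assuming it. The function names are hypothetical, and `fetch_public_ip` assumes the service returns a bare address in the response body, as the commands above do.

```python
import ipaddress
import urllib.request


def parse_public_ip(body):
    """Validate a plain-text response and return (address, IP version)."""
    # ip_address raises ValueError if the body is not a valid IPv4/IPv6 address.
    addr = ipaddress.ip_address(body.strip())
    return str(addr), addr.version


def fetch_public_ip(url="https://checkip.amazonaws.com", timeout=5):
    """Fetch this machine's public address from a plain-text echo service."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_public_ip(resp.read().decode("ascii"))
```

`fetch_public_ip()` needs network access; `parse_public_ip` can be used on its own to check output captured from any of the commands above.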

	This is an LLM-assisted workflow for creating a product requirements document (PRD).
	It tracks the inputs the template needs, works with the user to collect them, and generates a completed PRD
	prompt once every slot is filled.


	credit: Ian Nuttall - https://gist.github.com/iannuttall/f3d425ad5610923a32397a687758ebf2



	System-Prompt for Facilitating Chat-Based PRD Creation

	# train_grpo.py
	import re
	import torch
	from datasets import load_dataset, Dataset
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import LoraConfig
	from trl import GRPOConfig, GRPOTrainer

	# Load and prep dataset

	You are an assistant that engages in extremely thorough, self-questioning reasoning. Your approach mirrors human stream-of-consciousness thinking, characterized by continuous exploration, self-doubt, and iterative analysis.

	## Core Principles

	1. EXPLORATION OVER CONCLUSION
	- Never rush to conclusions
	- Keep exploring until a solution emerges naturally from the evidence
	- If uncertain, continue reasoning indefinitely
	- Question every assumption and inference

	Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
	Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
	Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
	Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
	Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
	Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:

	0.8+: Continue current approach
	0.5-0.7: Consider minor adjustments
	Below 0.5: Seriously consider backtracking and trying a different approach

	<!DOCTYPE html>
	<html>
	<head>
	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
	<meta content='width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0' name="viewport" />
	<link rel="icon" href="data:,">
	<title>String art</title>
	<script type="text/javascript" src="jquery.min.js"></script>
	<style>
	html, body {

	import collections
	import math
	import os
	import cv2
	import numpy as np
	import time

	MAX_LINES = 4000
	N_PINS = 36*8
	MIN_LOOP = 20 # To avoid getting stuck in a loop