
Local AI

Or: part of my ongoing learning journey with Artificial Intelligence.

I cannot imagine there is anyone capable of using a computer and the internet who has not been subjected to some Artificial Intelligence add-on. It is definitely a worthwhile endeavor to spend some time playing with AI tools – Claude, OpenAI, Gemini, CoPilot, and the like. These are highly functional tools, although exactly how functional, and with which limitations, is an interesting question. Depending on your world view, these tools are either heralding the end of the world as we know it, with a rapid dystopian descent into the haves and have-nots just around the corner, OR a utopian vision of creativity and productivity whose benefits flow broadly across society.

I have found that the models are improving rapidly but are still very capable of gross errors. At least for the moment, one should treat them as junior apprentices requiring the oversight of an experienced journeyman. How individuals will navigate the career transition from student to journeyman, when the intervening space is more cost-effectively filled with AI, is an interesting question.

One of the criticisms of public (cloud-based) models is the potential loss of privacy: your questions, and perhaps your data, are being sent to companies whose privacy policies are word salads combined with mandatory arbitration clauses, and who use your information in opaque ways. Which leads to the question: can consumers find ways of using AI tools locally without compromising privacy and security?

Spoiler: Yes, of course – but with constraints.

On my Linux-based XPS (vintage 2018 – so quite old in the scheme of things) I installed a local AI system. I will not repeat the directions here; I used this post (Running AI Locally Using Ollama on Ubuntu Linux), and followed the link in that article to install Docker, which allows for a clean web interface. Finally, I used Linux's native feature to turn a webpage into a standalone application. Very cool!
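For reference, the setup the linked post walks through boils down to a handful of commands. A minimal sketch follows; note that the model name (llama3.2) and the web interface (Open WebUI) are my assumptions – substitute whatever the guide you follow specifies, and consider reviewing any install script before piping it to your shell.

```shell
# Install Ollama via its official install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull and run a model (llama3.2 is just an example; pick one sized for your RAM)
ollama pull llama3.2
ollama run llama3.2 "Hello from my old XPS"

# Optional: a web front end in Docker (Open WebUI is one common choice)
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```

After the Docker step, the web interface is reachable at http://localhost:3000, which is the page one can then wrap as a standalone application.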

The good news is that it works. The bad news: in my particular setup, it works v–e–r–y s–l–o–w–l–y. To the point of being painful. In my case the model thinks for several minutes (about 5) before starting to respond, and then replies at roughly 20 words per minute. Did I mention that it works? Yay!
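For anyone curious to poke at the local model directly, without the web interface, Ollama exposes a small HTTP API on localhost. A minimal sketch using only the standard library – the model name here is an assumption; use whichever model you pulled:

```python
import json
import urllib.request

# Ollama's documented default local endpoint
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str, stream: bool = False) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the Ollama service running, `ask("llama3.2", "hello")` returns the model's reply – and timing that call is an easy way to measure just how slow (or fast) your hardware really is.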

Obviously, with better hardware – in particular a Graphics Processing Unit (GPU) or, better yet, a Neural Processing Unit (NPU) – plus more cores, more memory, and so on, one could expect much faster processing.

A few things I have not yet tested: adding files or a knowledge base to see how that influences answers. I would also like to find a more modern machine – I am eyeing my child's gaming PC … – to see how much of a performance improvement can be had. Finally, I am looking at trying to create RAG (Retrieval-Augmented Generation) and/or agentic systems – but that will probably need to wait for better hardware.

All in all – a positive experience.

Be well.