add-image-vision

Quality Score: 93/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Image Vision Skill

Adds the ability for NanoClaw agents to see and understand images sent via WhatsApp. Images are downloaded, resized with sharp, saved to the group workspace, and passed to the agent as base64-encoded multimodal content blocks.

## Phase 1: Pre-flight

1. Check if `src/image.ts` exists; skip to Phase 3 if already applied
2. Confirm `sharp` is installable (native bindings require build tools)

**Prerequisite:** WhatsApp must be installed first (`skill/whatsapp` merged). This skill modifies WhatsApp channel files.

## Phase 2: Apply Code Changes

### Ensure WhatsApp fork remote

```bash
git remote -v
```

If `whatsapp` is missing, add it:

```bash
git remote add whatsapp https://github.com/qwibitai/nanoclaw-whatsapp.git
```

### Merge the skill branch

```bash
git fetch whatsapp skill/image-vision
git merge whatsapp/skill/image-vision || {
  git checkout --theirs package-lock.json
  git add package-lock.json
  git merge --continue
}
```

This merges in:

- `src/image.ts` (image download, resize via sharp, base64 encoding)
- `src/image.test.ts` (8 unit tests)
- Image attachment handling in `src/channels/whatsapp.ts`
- Image passing to agent in `src/index.ts` and `src/container-runner.ts`
- Image content block support in `container/agent-runner/src/index.ts`
- `sharp` npm dependency in `package.json`

If the merge reports conflicts, resolve them by reading the conflicted files and understanding the intent of both sides.

### Validate code changes

```bash
npm...
```
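
The overview above says images reach the agent as base64-encoded multimodal content blocks. As a rough illustration of that last step only, the sketch below wraps already-resized JPEG bytes in an image block and pairs it with the message caption. The block shape is modeled on the image content block of the Anthropic Messages API; whether `src/image.ts` uses exactly this shape is an assumption, and the names `toImageBlock` and `buildMessageContent` are hypothetical. Downloading and resizing with sharp are assumed to have happened upstream and are left out so the example has no native dependency.

```typescript
// Hypothetical types and helpers for illustration; not taken from src/image.ts.
interface ImageContentBlock {
  type: "image";
  source: { type: "base64"; media_type: "image/jpeg"; data: string };
}

interface TextContentBlock {
  type: "text";
  text: string;
}

type ContentBlock = ImageContentBlock | TextContentBlock;

// Wrap JPEG bytes (assumed to be resized already) in a base64 image block.
function toImageBlock(jpegBytes: Uint8Array): ImageContentBlock {
  return {
    type: "image",
    source: {
      type: "base64",
      media_type: "image/jpeg",
      data: Buffer.from(jpegBytes).toString("base64"),
    },
  };
}

// Build the content array for one message: image blocks first,
// followed by the caption as a text block when it is non-empty.
function buildMessageContent(
  caption: string,
  images: Uint8Array[],
): ContentBlock[] {
  const blocks: ContentBlock[] = images.map((bytes) => toImageBlock(bytes));
  if (caption.trim().length > 0) {
    blocks.push({ type: "text", text: caption });
  }
  return blocks;
}
```

Placing the images before the caption is a design choice in this sketch, not something the skill description specifies.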

Details

Author: qwibitai
Repository: qwibitai/nanoclaw
Created: 4 months ago
Last Updated: today
Language: TypeScript
License: MIT

Similar Skills

add-image-vision

add-voice-transcription

add-whatsapp