genimg-gemini-web

Image generation skill using Gemini Web. Generates images from text prompts via Google Gemini, and also supports text generation. Use it as the image generation backend for other skills such as cover-image, xhs-images, and article-illustrator.
proyecto26/sherlock-ai-plugin · ★ 27 · AI & Automation · score 77

Install: claude install-skill proyecto26/sherlock-ai-plugin

# Gemini Web Client

Supports:

- Text generation
- Image generation (download + save)
- Reference image upload (attach images for vision tasks)
- Multi-turn conversations within the same executor instance (`keepSession`)
- Experimental video generation (`generateVideo`): Gemini may return an async placeholder; download might require Gemini web UI

## Quick start

```bash
npx -y bun scripts/main.ts "Hello, Gemini"
npx -y bun scripts/main.ts --prompt "Explain quantum computing"
npx -y bun scripts/main.ts --prompt "A cute cat" --image cat.png
npx -y bun scripts/main.ts --promptfiles system.md content.md --image out.png

# Multi-turn conversation (agent generates unique sessionId)
npx -y bun scripts/main.ts "Remember this: 42" --sessionId my-unique-id-123
npx -y bun scripts/main.ts "What number?" --sessionId my-unique-id-123
```

## Executor options (programmatic)

This skill is typically consumed via `createGeminiWebExecutor(geminiOptions)` (see `scripts/executor.ts`).

Key options in `GeminiWebOptions`:

- `referenceImages?: string | string[]`
  Upload local images as references (vision input).
- `keepSession?: boolean`
  Reuse Gemini `chatMetadata` to continue the same conversation across calls (required if you want reference images to persist across multiple messages).
- `generateVideo?: string`
  Generate a video and (best-effort) download to the given path. Gemini may return `video_gen_chip` (async); in that case you must open Gemini web UI to download the result.

Notes:

- `gene