VJ UNION

Sleepless Monk
Open Source AI updates for Visual artists

There have been some serious updates in open source AI, even as OpenAI, Runway and Suno unveil new models and strike deals with content providers and streaming services to get around copyright issues.

Let's start with CogVideoX, perhaps the best open source video model to date. There are various ways to run it, including ComfyUI nodes.
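Outside ComfyUI, one way to try the model is through the diffusers library, which the project updates below mention. The following is a minimal sketch, not the project's official script. The model id `THUDM/CogVideoX-5b`, the `ClipSettings` helper and the default values are assumptions for illustration, and actually generating a clip needs a capable GPU and a multi-gigabyte model download.

```python
from dataclasses import dataclass


@dataclass
class ClipSettings:
    """Hypothetical helper that groups the knobs for one clip."""
    prompt: str
    num_frames: int = 49        # assumed default, adjust for your model
    fps: int = 8                # playback rate used when exporting
    steps: int = 50             # number of denoising steps
    guidance_scale: float = 6.0

    @property
    def seconds(self) -> float:
        # Clip length is simply frame count divided by playback rate.
        return self.num_frames / self.fps


def generate_clip(settings: ClipSettings,
                  out_path: str = "clip.mp4",
                  model_id: str = "THUDM/CogVideoX-5b") -> str:
    # Heavy imports live inside the function so the settings helper
    # can be used on a machine without torch or diffusers installed.
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    pipe = CogVideoXPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use
    pipe.vae.enable_tiling()         # decodes frames in tiles to save memory

    frames = pipe(
        prompt=settings.prompt,
        num_frames=settings.num_frames,
        num_inference_steps=settings.steps,
        guidance_scale=settings.guidance_scale,
    ).frames[0]
    export_to_video(frames, out_path, fps=settings.fps)
    return out_path


if __name__ == "__main__":
    settings = ClipSettings(prompt="a neon city at night, slow aerial shot")
    print(f"Planned clip length: {settings.seconds} s")
    # Uncomment to actually run the model (slow, needs a GPU):
    # generate_clip(settings)
```

Keeping the settings separate from the pipeline call makes it easy to plan clip lengths or queue several prompts before committing GPU time.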

zai-org / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

CogVideo & CogVideoX

Experience the CogVideoX-5B model online at 🤗 Huggingface Space or 🤖 ModelScope Space

📚 View the paper and user guide

👋 Join our WeChat and Discord

📍 Visit QingYing and API Platform to experience larger-scale commercial video generation models

Project Updates

  • 🔥🔥 News: 2025/03/24: We have launched CogKit, a fine-tuning and inference framework for the CogView4 and CogVideoX series. This toolkit allows you to fully explore and utilize our multimodal generation models.
  • 🔥 News: 2025/02/28: DDIM Inverse is now supported in CogVideoX-5B and CogVideoX1.5-5B. Check here.
  • 🔥 News: 2025/01/08: We have updated the code for Lora fine-tuning based on the diffusers version model, which uses less GPU memory. For more details, please see here.
  • 🔥 News: 2024/11/15: We released the CogVideoX1.5 model in the diffusers version. Only minor parameter adjustments are needed to…

Temporal Prompt Engine, from Temporal Labs, is a Python-based video suite that combines CogVideoX with Ollama to provide integrated LLM and video generation services for filmmakers, artists and others.

TemporalLabsLLC-SOL / TemporalPromptEngine

A comprehensive, click to install, fully open-source, Video + Audio Generation AIO Toolkit using advanced prompt engineering plus the power of CogVideox + AudioLDM2 + Python!

Temporal Prompt Engine: Local, Open-Source, Intuitive, Cinematic Prompt Engine + Video and Audio Generation Suite for Nvidia GPUs

NOW FEATURING: custom 12B HunyuanVideo script with incorporated MMAudio for 80 GB cards.

MASSIVE UPDATE TO INSTRUCTIONS BELOW COMING VERY SOON (12/11/2024)

I am looking for a volunteer assistant. If you're interested, reach out at [email protected]. This is moving to a webapp version VERY soon.

Table of Contents

  1. Introduction
  2. Features Overview
  3. Installation
  4. Quick Start Guide
  5. API Key Setup
  6. Story Mode: Unleash Epic Narratives
  7. Inspirational Use Cases
  8. Harnessing the Power of ComfyUI
  9. Local Video Generation Using CogVideo
  10. Join the Temporal Labs Journey
  11. Donations and Support
  12. Additional Services Offered
  13. Attribution and Courtesy Request
  14. Contact
  15. Acknowledgments


Introduction

Welcome to the Temporal Prompt Engine, a comprehensive framework for building out batch variations or story sequences for video prompt generators. This idea originally started as a ComfyUI workflow for CogVideoX but has since evolved into…
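The idea of batch variations and story sequences can be sketched in a few lines of Python. This is not code from the Temporal Prompt Engine; the function names `batch_variations` and `story_sequence` and the prompt format are hypothetical, chosen only to illustrate the concept.

```python
from itertools import product


def batch_variations(subject, styles, camera_moves):
    """Return one prompt per (style, camera move) combination."""
    return [
        f"{subject}, {style}, {move}"
        for style, move in product(styles, camera_moves)
    ]


def story_sequence(scenes, shared_style):
    """Return numbered shot prompts that share one visual style."""
    return [
        f"Shot {i}: {scene}, {shared_style}"
        for i, scene in enumerate(scenes, start=1)
    ]


if __name__ == "__main__":
    for prompt in batch_variations(
        "a dancer in fog",
        styles=["film noir", "vaporwave"],
        camera_moves=["slow dolly in", "orbit shot"],
    ):
        print(prompt)

    for prompt in story_sequence(
        ["an empty club at dawn", "the crowd arrives", "lights out"],
        shared_style="35mm, shallow depth of field",
    ):
        print(prompt)
```

Each resulting string could then be passed to a video model one at a time, which is what makes batch runs and multi-shot stories practical on a single GPU.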




This is not news, but I recently tested InvokeAI. It is an interesting alternative for those who want the modular aspects of ComfyUI without the complexity.
https://invoke-ai.github.io/InvokeAI/installation/installer/#running-the-installer

Finally, some cutting-edge, new-generation image models: the long-awaited SD 3.5 and the Flux model from their competitors.

https://comfyanonymous.github.io/ComfyUI_examples/sd3/?ref=blog.comfy.org

https://stable-diffusion-art.com/flux-comfyui/

OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It is designed to be simple, flexible, and easy to use. We provide inference code so that everyone can explore more functionalities of OmniGen.
Existing image generation models often require loading several additional network modules (such as ControlNet, IP-Adapter, Reference-Net, etc.) and performing extra preprocessing steps (e.g., face detection, pose estimation, cropping, etc.) to generate a satisfactory image. However, we believe that the future image generation paradigm should be simpler and more flexible, that is, generating various images directly through arbitrary multi-modal instructions without the need for additional plugins and operations, similar to how GPT works in language generation.

VectorSpaceLab / OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

OmniGen: Unified Image Generation

We are hiring FTE researchers and interns! If you are interested in working with us on Vision Generation Models, please contact us: [email protected]!

1. News

2. Overview

OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It is…




In the rapidly evolving landscape of Artificial General Intelligence (AGI), the emergence of Florence-2 signifies a monumental stride forward in the realm of computer vision. Developed by a team at Azure AI, Microsoft, this state-of-the-art vision foundation model aims to redefine the way machines comprehend and interpret visual data. Let's delve into this groundbreaking advancement and explore how Florence-2 is poised to revolutionize the field of AI. ComfyUI nodes are available for using Florence-2.
https://www.labellerr.com/blog/florence-2-vision-model-by-microsoft/

On a closing note, let me again introduce Stability Matrix, a single system to manage all your AI art needs. It can install various interfaces, such as SD Web UI and its variants Forge and SD.Next, as well as ComfyUI, InvokeAI, Fooocus, SwarmUI and OneTrainer. It mostly works great; I'm only running InvokeAI separately as of now. It makes keeping track of things easy. Hopefully it will also integrate some LLM solutions, NeRF, and TD and UE plugins in the future to become a full multimodal system, though as of now these need to be installed separately.

LykosAI / StabilityMatrix

Multi-Platform Package Manager for Stable Diffusion

Stability Matrix

Platforms: Windows, Linux (AppImage), Arch Linux (AUR), macOS

Multi-Platform Package Manager and Inference UI for Stable Diffusion

🖱️ One click install and update for Stable Diffusion Web UI Packages

✨ Inference - A Reimagined Interface for Stable
