Search Results

News / Articles

Lablab.ai
lablab.ai > ai-hackathons > amd-developer-hackathon-act-ii > nigesh > rocm-navigator

AI app: ROCm Navigator for AMD Developer Hackathon: ACT II hackathon

2+ hour, 10+ min ago (566+ words) ROCm Navigator is an enterprise-grade, autonomous multi-agent platform engineered to break proprietary software lock-in by automating the migration of GPU applications from NVIDIA CUDA to open-source AMD ROCm (HIP). While traditional conversion tools (like regex-based hipify) frequently break on complex…...

Symbols: amd,nvda

Flowtivity
flowtivity.ai > blog > colibri-glm-52-local-inference-disk-streaming

Colibri: Run GLM-5.2 (744B MoE) on a 25GB Laptop

3+ hour, 27+ min ago (497+ words) Colibri runs GLM-5.2 on a 25GB laptop via pure-C disk streaming. Architecture, benchmarks, DGX Spark support. The key point is: Colibrì is a pure-C inference engine that runs GLM-5.2, a 744-billion-parameter Mixture-of-Experts model from Zhipu AI, on a consumer machine with…...

Symbols: moe

DEV Community
dev.to > odd_background_328 > operating-a-mesh-llm-starts-with-failure-domains-not-free-gpus-1pp7

Operating a Mesh LLM Starts With Failure Domains, Not Free GPUs

14+ hour, 22+ min ago (320+ words) Mesh LLM reached 215 points in the Hacker News snapshot I reviewed at 2026-07-12 08:00 UTC. Its author describes an OpenAI-compatible API that can run locally, route to a peer, or split model layers across multiple machines. That is an appealing use of…...

Symbols: qct,llms

Hrittik Roy
hrittikhere.com > posts > kubernetes-topology-aware-scheduling-kai

Topology-Aware GPU scheduling with KAI Scheduler

14+ hour, 36+ min ago (1574+ words) How KAI Scheduler uses topology-aware placement and gang scheduling to keep distributed GPU training inside one region, zone, or rack on Kubernetes. You have four NVIDIA A100 40 GB GPUs in one Kubernetes cluster. Two are attached to nodes in us-central1 in…...

Symbols: nvda

DEV Community
dev.to > bossandboss > i-got-99x-lower-ttft-on-a-real-android-phone-by-reusing-llamacpp-kv-state-1ngi

I Got 9.9x Lower TTFT on a Real Android Phone by Reusing llama.cpp KV State

1+ day, 54+ min ago (743+ words) Local LLM inference has an expensive habit: It recomputes prefixes it has already seen. A system prompt. A reused RAG document. A few-shot block. A long static context. If the prefix is identical, why pay the prefill cost again? That's…...

Symbols: llm,llms

DEV Community
dev.to > dramasamy > from-api-to-gpu-week-1-understanding-nvidia-dgx-spark-environment-1aol

From API to GPU, Week 1: Understanding NVIDIA DGX Spark Environment

1+ day, 1+ hour ago (1456+ words) I've used AI through APIs for years — POST a prompt, get tokens back, ship the feature. I have never once deployed a model myself. No PyTorch, no GPU memory math, no idea what actually happens between my HTTP request and…...

Symbols: nvda

Eightfold
nvidia.eightfold.ai > careers > job > 893396253752

Senior Software Engineer, CUDA C++ Core Libraries | NVIDIA Corporation

1+ day, 7+ hour ago (632+ words) NVIDIA’s accelerated computing platform is foundational to modern HPC and AI. At the center of this platform are CUDA Core Libraries that provide the algorithms, abstractions, and runtime capabilities needed to build fast, reliable, and scalable GPU-accelerated software. We are…...

Symbols: nvda

DEV Community
dev.to > abdollah_ebadi_cbec8f6471 > running-multiple-comfyui-instances-in-parallel-on-a-single-gpu-what-actually-breaks-first-4n04

Running Multiple ComfyUI Instances in Parallel on a Single GPU — What Actually Breaks First

1+ day, 8+ hour ago (847+ words) The Problem Nobody Has Measured If you have spent any time scaling ComfyUI beyond a single... Tagged with comfyui, python, machinelearning, gpu....

Symbols: mnnvl,nvda,btc-usd,nok

Blockchain.News
blockchain.news > news > nvidia-jax-llm-training-host-offloading

NVIDIA Optimizes JAX LLM Training with Host Offloading

1+ day, 23+ hour ago (367+ words) Lawrence Jengar, Jul 10, 2026 18:51. NVIDIA's host offloading for JAX LLM training boosts GPU memory efficiency, enabling larger batch sizes and faster throughput. The Blackwell GPU, paired with NVIDIA’s Grace CPU, achieves up to 900 GB/s bidirectional bandwidth via NVLink-C2C. This high-speed…...

Symbols: llms,llm,fl,nvda

StartupHub.ai
startuphub.ai > startups > vulkanic > open-source-alternatives

Open Source Alternatives to Vulkanic (2026)

2+ day, 8+ hour ago (47+ words) 60 open source Agentic AI options similar to Vulkanic, ranked by named competitors first, sector overlap, and our 0-100 AI-readiness score. Each profile notes license and self-hosting where we have it.

Symbols: nvda
