San Jose, California, United States
5K followers 500+ connections

About

Founding Principal Architect of CS²B Technologies Inc (🔗…

Services

Posts by Subramaniyam Venkata Pooni

Activity

5K followers

Experience & Education

  • CS²B TECHNOLOGIES INC

Licenses & Certifications

Patents

Courses

  • Advanced Programming with Python by David Beazley

    -

  • Functional programming in Scala by John De Goes

    -

  • Implementing a Raft Consensus Algorithm

    -

  • The art of Functional Design by John De Goes

    -

  • Write a Compiler (in Python) by David Beazley

    -

Projects

  • 🚀Crusty Lox Interpreter in Rust [Based on Crafting Interpreters by Bob Nystrom]

    - Present

    Description:
    Reimplemented a tree-walking interpreter for the Lox language using idiomatic Rust, emphasizing functional programming techniques and immutability over OOP. The project explores compiler construction fundamentals: lexing, parsing, AST generation, and interpretation with proper error handling and runtime environments.

    Key Features:
    Scanner / Lexer: Tokenizes Lox source using Rust string manipulation and pattern matching
    Parser: Constructs ASTs from token streams with recursive descent techniques
    Evaluator: Implements tree-walk evaluation supporting expressions, variables, functions, and control flow
    Environment: Mutable runtime state using nested HashMap environments (scope chains)
    Project Architecture: Modular design using idiomatic Rust crates and cargo-based organization
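
    The nested-environment scope chain listed above can be sketched as follows. This is a minimal illustration in Python rather than Rust, and it is not taken from the private repository: the class name `Environment` and its methods `define`, `get`, and `assign` are assumptions modeled on the design described in Crafting Interpreters.

```python
from __future__ import annotations

from typing import Any, Optional


class Environment:
    """One lexical scope: a dict of bindings plus a link to the enclosing scope."""

    def __init__(self, enclosing: Optional[Environment] = None) -> None:
        self.values: dict[str, Any] = {}
        self.enclosing = enclosing

    def define(self, name: str, value: Any) -> None:
        # A declaration always binds in the innermost (current) scope.
        self.values[name] = value

    def get(self, name: str) -> Any:
        # Walk outward through the chain until the name is found.
        env: Optional[Environment] = self
        while env is not None:
            if name in env.values:
                return env.values[name]
            env = env.enclosing
        raise NameError(f"Undefined variable '{name}'.")

    def assign(self, name: str, value: Any) -> None:
        # Assignment updates the nearest scope that already declares the name.
        env: Optional[Environment] = self
        while env is not None:
            if name in env.values:
                env.values[name] = value
                return
            env = env.enclosing
        raise NameError(f"Undefined variable '{name}'.")


# Usage: a block scope nested inside the global scope.
globals_ = Environment()
globals_.define("x", 1)
block = Environment(enclosing=globals_)
block.define("y", 2)
block.assign("x", 10)  # resolves to the global binding
```

    In Rust the same chain is commonly expressed with shared, interior-mutable references between scopes; that detail is omitted here.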

    Rust Skills Developed:

    Ownership and lifetimes in recursive data structures
    Enums, pattern matching, and traits
    Functional thinking and iterator combinators
    Modularization and test-driven development

    References:
    Repo: github.com/SamPooni/crusty_interpreter [private]
    Based on Crafting Interpreters [Python-based evaluator → translated to idiomatic Rust]

  • LLMOps Frameworks | Prompt Engineering | RAG Observability

    -

    ✅ Led design of proprietary prompt-based LLM finetuning pipelines for Mistral-7B, Phi-2, and Gemma models, leveraging RLHF-based pairwise prompt evaluations. Integrated runtime observability for every step—from embedding generation to output hallucination scoring.
    ✅ Designed CI/CD workflows for chunking, embedding, inference retry, and telemetry rollback via GitLab + Argo-based pipelines. Benchmarked structured prompt extractors with FastAPI, LangChain, and pgvector-backed retrieval systems.
    ✅ Delivered GPU-aware observability modules to trace vRAM fragmentation, token throughput, and latency spikes—instrumented using Prometheus exporters and real-time dashboards in Grafana.
    ✅ Introduced model versioning logic and signature hash matching to automate compatibility and rollback decisions during deployment of multi-tenant RAG-based GenAI systems.
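
    One way to picture the signature hash matching mentioned above is the sketch below. It is a hypothetical illustration, not the profile author's implementation: the signature fields (`inputs`, `outputs`, `embedding_dim`) and the function names are assumptions chosen for the example.

```python
import hashlib
import json


def signature_hash(signature: dict) -> str:
    """Hash a model's I/O signature independently of key order."""
    # Canonical JSON (sorted keys, fixed separators) gives a stable byte string.
    canonical = json.dumps(signature, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deployment_decision(candidate: dict, serving: dict) -> str:
    """Return 'promote' when the signatures match, otherwise 'rollback'."""
    if signature_hash(candidate) == signature_hash(serving):
        return "promote"
    return "rollback"


# Hypothetical signatures for a retrieval model and two candidate versions.
serving = {"inputs": ["text"], "outputs": ["embedding"], "embedding_dim": 768}
same_contract = {"embedding_dim": 768, "outputs": ["embedding"], "inputs": ["text"]}
changed_contract = {"inputs": ["text"], "outputs": ["embedding"], "embedding_dim": 1024}
```

    Under these assumptions, `deployment_decision(same_contract, serving)` yields a promotion and `deployment_decision(changed_contract, serving)` yields a rollback, because only the second changes the hashed contract.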

    Key KPIs & Business Impact: Led prompt-based fine-tuning of LLMs like Mistral-7B, Phi-2, and Gemma, achieving a 30% improvement in response consistency using RLHF-driven evaluation pipelines. Built 4× faster CI/CD workflows for chunking, embedding, inference retries, and telemetry rollback using GitLab and Argo. Delivered 100% runtime traceability of token throughput, vRAM fragmentation, and latency spikes through Prometheus and Grafana. Introduced automated versioning and signature-based hash matching to enable zero-downtime rollback, while inference retry mechanisms maintained <10s recovery time, enhancing reliability of multi-tenant RAG deployments.

  • AI Performance Engineering | HPC Systems | LLM Inference Optimization

    -

    ✅ Currently optimizing AI performance for both training and inference workloads across large-scale LLMs such as Llama 3.1, Llama 2–70B, Mixtral, BERT, ResNet, 3D U-Net, and Stable Diffusion using high-performance Supermicro and Dell GPU clusters. Tuning involves TensorRT, vLLM, Triton, and vCUDA workflows, aligned with BIOS, NUMA, storage tiering, and memory management optimizations.
    ✅ Custom performance tuning on NVIDIA (B200, GH200, H100), AMD (MI350X, MI325X, MI300X), and Intel Xeon platforms, with submitted MLPerf benchmarks demonstrating up to 3x model throughput gains over baseline.
    ✅ Integrated end-to-end performance tracing pipelines using Prometheus, OpenSearch, and Grafana for AI inference and training workloads, enabling real-time observability on GPU utilization, batch latencies, memory fragmentation, and inference token rates.
    ✅ Developed prompt optimization and RLHF pipelines for LLM evaluation and fine-tuning using LoRA/PEFT, including architecture comparisons between prompt-based tuning and RAG-based grounding with vector retrieval systems.
    ✅ Engineered AI workload bring-up flows from scratch, including prompt injection, self-healing batch retries, model profiling, GPU saturation detection, and fine-tuning with hardware-aware compilers (ONNX, MLIR).
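
    The self-healing batch retries mentioned above could take many forms. The sketch below assumes the simplest one, a bounded retry loop with exponential backoff. The function name `run_with_retries`, its parameters, and the choice of `RuntimeError` as the transient failure type are all hypothetical.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_with_retries(
    run_batch: Callable[[Sequence[T]], R],
    batch: Sequence[T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> R:
    """Call run_batch(batch), retrying transient failures with backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return run_batch(batch)
        except RuntimeError as err:  # assumed to signal a transient failure
            last_error = err
            if attempt < max_attempts - 1:
                # Delay doubles on each retry: base, 2*base, 4*base, ...
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"batch failed after {max_attempts} attempts") from last_error
```

    Passing `sleep` as a parameter keeps the helper testable without real delays; a production pipeline would also likely distinguish retryable errors from permanent ones.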

    Key KPIs & Business Impact: Delivered up to 3× throughput improvements in MLPerf benchmarks across NVIDIA (B200, GH200, H100) and AMD (MI350X, MI300X) platforms through BIOS, NUMA, and batch-size tuning. Achieved 40% reduction in latency jitter and 2× faster LLM fine-tuning using LoRA with ONNX and MLIR-based compilers. Engineered full-stack automation to bring up models in under 2 minutes, and implemented self-healing inference pipelines with a 95% success rate. Deployed real-time GPU observability using Prometheus, Grafana, and OpenSearch, enabling 100% visibility into token rates, memory fragmentation, and GPU utilization.

  • AIaaS for CSPs Expertise

    -

    ✅GenAI Platform Automation: Enabled delivery of VMware Private AI Foundation with NVIDIA (PAIF-N) to empower Cloud Service Providers (CSPs) to monetize Generative AI with a production-ready, multi-tenant platform.
    ✅AIaaS Monetization Layers: GPU as a Service (GaaS) — GPU-backed VM rental, AI PaaS — Pre-configured DL environments (Jupyter, PyTorch, Triton), Model-as-a-Service (MaaS) — Hosted inference APIs (LLaMA, Falcon, Mixtral), AI Applications — Custom chatbots, document agents, RAG assistants
    ✅RAG Stack CI/CD: Authored SDKs and pipelines for chunking, embeddings, inference retries, rollback, telemetry

    Key KPIs & Business Impact: Less than 2 min to deploy a full LLM + RAG pipeline, ~40% margins vs. 10–15% for traditional IaaS, zero-touch provisioning of GPU-backed AI workstations, 100% model traceability via GitLab + Harbor, 95%+ developer adoption of platform tools, 3× faster time-to-revenue via a pre-integrated GenAI stack

  • LLM Agent Programming

    -

    ✅ Engineered modular agents using LangChain, Triton, FastAPI, and pgvector for RAG-backed assistants and structured extractors.

    Key KPIs & Business Impact: 95%+ task success, less than 2s latency, 80%+ tool accuracy, 50% fewer hallucinations, 3× code reuse, 10K+ tool calls/week, full observability via logging/tracing

  • VMware Aria Automation | Multi-Cloud IaC & CI/CD | Hybrid Cloud

    -

    ✅Designed and deployed advanced automation using VMware Aria Automation 8.x, with high availability (HA) clustering, SAML/LDAP-based identity integration, and RBAC controls. Architected self-service hybrid cloud platforms across VCF, AWS, Azure, and GCP using Aria Service Broker and dynamic NSX-T policies.
    ✅Enabled full observability using Aria Operations, authoring lightweight collectors and telemetry agents in C++ and Python for performance metrics

    Key KPIs & Business Impact: Drove CI/CD automation, multi-vCenter orchestration, and post-sales technical governance across Tier-1 telco operators in the Americas. $50M+ in SOWs influenced, 100% 5G Open RAN success, 30%+ faster TTM, 20+ exec sessions, 50+ CI/CD flows, 50+ CNFs onboarded, 90%+ SLA compliance, 3× adoption growth, 5+ roadmap wins, less than 10% post-deploy issues

  • Compiler for WebAssembly based AI Edge Inference (Python, LLVM, MLIR)

    -

    ✅Designed and implemented a custom MLIR-based compiler in Python targeting WebAssembly (WASM) for browser-native ML inference with near-native performance.

    ✅Created a statically typed, C-like DSL supporting LLVM IR and WASM backends, enabling developers to write edge ML logic that compiles to highly efficient bytecode.

    ✅Delivered a full compiler toolchain: front-end parser, transpiler to C, LLVM codegen, and a runtime interpreter, achieving performance comparable to hand-tuned C++.

    ✅Use cases include IoT analytics, offline AI agents, and real-time edge inferencing in constrained environments like browsers and embedded devices.
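
    To make the front-end-to-WASM idea concrete, here is a minimal sketch of a recursive-descent compiler that lowers integer arithmetic to WebAssembly text-format stack instructions (`i32.const`, `i32.add`, `i32.sub`, `i32.mul`). It is not the DSL or toolchain from the private repository; the toy grammar and the names `tokenize` and `compile_expr` are assumptions made for illustration.

```python
import re

# A token is an integer literal or one of the operators/parentheses.
TOKEN = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(src: str) -> list[str]:
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def compile_expr(src: str) -> list[str]:
    """Compile 'expr := term (('+'|'-') term)*' into WAT stack instructions."""
    tokens = tokenize(src)
    pos = 0
    out: list[str] = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok is not None and tok.isdigit():
            out.append(f"i32.const {tok}")
        elif tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() == "*":
            advance()
            factor()
            out.append("i32.mul")  # operands are already on the stack

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append("i32.add" if op == "+" else "i32.sub")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return out
```

    Because WebAssembly is a stack machine, emitting operands before their operator (a post-order walk) is enough for expressions like these; a real compiler would add types, locals, and a module wrapper around the instruction list.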


    References:
    Repo: https://github.com/SamPooni/compilers [private]

  • 🚀Scalable Distributed ML Parameter Server (Python, Asyncio, Raft)

    -

    ✅Engineered a high-performance distributed Parameter Server in Python for scalable ML training across multi-node clusters, supporting tens of thousands of parameter updates per second.

    ✅Architected a fault-tolerant key-value store with Raft-based consensus for leader election, replication, and dynamic node membership, ensuring strong consistency and high availability.

    ✅Integrated priority-aware scheduling for gradient aggregation and asynchronous messaging via asyncio, improving inter-node throughput by 40%+ under peak load.

    ✅Designed for seamless integration with PyTorch/TensorFlow training loops and extensible for federated learning or reinforcement learning workloads.
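
    Priority-aware gradient aggregation over asyncio, as described above, might look roughly like the following. This is a single-process sketch under stated assumptions, not the repository's code: the function name `aggregate`, the `(priority, key, gradient)` tuple layout, and the plain SGD update are all hypothetical, and Raft replication is left out entirely.

```python
import asyncio
import itertools


async def aggregate(updates, params, lr=0.1):
    """Apply gradient updates in priority order (lower number first).

    updates: iterable of (priority, key, gradient) tuples.
    params:  dict mapping parameter key to its current value, updated in place.
    Returns the keys in the order their updates were applied.
    """
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    counter = itertools.count()  # tie-breaker: FIFO within the same priority
    for priority, key, grad in updates:
        await queue.put((priority, next(counter), key, grad))

    applied = []
    while not queue.empty():
        _, _, key, grad = await queue.get()
        params[key] = params.get(key, 0.0) - lr * grad  # plain SGD step
        applied.append(key)
        queue.task_done()
    return applied


# Usage: the priority-0 update for "a" is applied before the priority-2 ones.
params = {"a": 1.0, "b": 1.0, "c": 1.0}
order = asyncio.run(
    aggregate([(2, "b", 1.0), (0, "a", 0.5), (2, "c", 1.0)], params, lr=0.5)
)
```

    In a multi-node setting the producers would be network handlers feeding the queue concurrently while one or more consumer tasks drain it; the ordering logic stays the same.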

    References:
    Repo: https://github.com/SamPooni/pyraft [private]

  • Java Language New Features Experimentation

    -

    1. Design Patterns: Applying Powerful Design Ideas
    2. Scala Essentials: The Intriguing Parts
    3. Functional Programming in Java: Creating Maintainable Code
    4. Java Modules: From Legacy to Modularized Code
    5. The New Java: Languages and JDK Features from 9 to 14

  • Applied Research & Development in Software Design Patterns and Testing Frameworks in Python

    -

    Designed and developed a modular, high-impact software engineering framework focused on pragmatic, scalable programming practices. The project explored how to build complex systems by focusing on composability, interface design, layered abstractions, and testable architecture—rather than language or framework specifics. It integrated functional, object-oriented, and event-driven paradigms into a cohesive design philosophy.

    The core components the project concentrates on are: Data Abstraction Layer, Interface Contracts, Compositional Class Architectures, Reactive Event Systems, Functional Primitives, Verification & Test Harness, and Problem-Driven Design Process.


    References:
    Repo: https://github.com/SamPooni/advanced_python_programming [private]

Honors & Awards

  • Certificate of Outstanding Contributions and Innovation

    Huawei, New Jersey Research Center

    In recognition of outstanding contributions to AI in the wireless space

Recommendations received
