Reports To: Director of Cloud Infrastructure

About the Role

We are seeking a Senior Technical Architect to drive architecture, design, and technology strategy for our enterprise-level revenue optimization and performance management platform. The ideal candidate will bring deep experience in cloud-native architecture, distributed systems, and modern application frameworks, with proven expertise in security, scalability, integrations, and enterprise data processing pipelines.

As a senior leader, you will collaborate with engineering, DevOps, InfoSec, product, and business stakeholders to ensure our platform is resilient, secure, compliant, and scalable while supporting our roadmap for growth and innovation.

Key Responsibilities

Architecture & Design

  • Cloud-Native Architecture: Expertise in designing AWS-based cloud architectures for scalability, high availability, and cost optimization (EKS, EMR, RDS, Redshift, S3, Lambda, VPC, IAM).
  • Microservices & API Design: Strong experience in microservices architecture, service decomposition, API gateway patterns, REST, GraphQL, gRPC, and event-driven messaging (Kafka, SQS, SNS).
  • Data Architecture: Ability to design data pipelines, ETL/ELT frameworks, data lakes, data warehouses, and distributed processing systems, ensuring data quality, schema evolution, and reconciliation.
  • Security & Compliance by Design: Embed security, access control, encryption, and compliance (SOC2, PCI, GDPR, ISO 27001) into all layers of architecture.
  • Scalability & Resilience: Design systems that handle geo-distributed deployments, multi-tenancy, auto-scaling, failover, and disaster recovery.
  • Observability & Monitoring: Define logging, monitoring, alerting, and performance tuning standards for applications and infrastructure.
  • CI/CD & Deployment Architecture: Design pipelines that enforce code quality, automated testing, versioning, and secure deployments across environments.
  • Technical Documentation & Decision Records: Clearly document architecture diagrams, design decisions, trade-offs, and rationale for stakeholders and auditors.
  • Future-State Roadmapping: Ability to plan evolutionary architecture and modernization strategies, including monolith-to-microservices migration, cloud adoption, and AI/ML integration.
  • Performance & Cost Optimization: Design for efficient compute, storage, and network usage while maintaining required SLA, latency, and throughput.

Qualifications & Skills

  • 10+ years of experience in enterprise software engineering and architecture.
  • Strong expertise in AWS Cloud services including EKS, EMR, RDS, Redshift, S3, Glue, Lambda, VPC, and IAM.
  • Proven experience in microservices architecture, API design (REST, GraphQL, gRPC), and event-driven systems (Kafka, SQS, SNS).
  • Deep expertise in data pipelines, ETL/ELT, data lakes, data warehousing, and distributed processing.
  • Experience with containerization and orchestration (Docker, Kubernetes, Helm, service mesh).
  • Strong understanding of security architecture, IAM, OAuth 2.0, OIDC, SAML, Auth0, and compliance frameworks (SOC 2, PCI DSS, GDPR, ISO 27001).
  • Proficiency in DevOps, CI/CD, GitOps, Jenkins, ArgoCD, GitHub Actions, and Infrastructure-as-Code (Terraform, Ansible, Pulumi).
  • Expertise in observability, monitoring, logging, tracing, and performance tuning using New Relic, OpenTelemetry, Prometheus, Grafana, ELK/EFK.
  • Extensive experience in database design and management: relational (MySQL, Postgres), NoSQL (DocumentDB, DynamoDB), and data warehouse (Redshift, Snowflake).
  • Experience designing geo-distributed, multi-tenant, high-availability, and resilient SaaS architectures.
  • Familiarity with frontend frameworks (Angular, React), backend frameworks (Java Spring Boot), and mobile application architecture.
  • Strong skills in architectural governance, technical debt management, and future-state roadmap planning.
  • Understanding of AI/ML workflows for anomaly detection, KPI forecasting, and optimization.
  • Expertise in cost optimization, scalability, disaster recovery, and high-performance infrastructure design.
  • Hands-on experience with software lifecycle best practices, agile methodologies, and code quality governance.
  • Ability to evaluate tools, frameworks, and platforms for strategic enterprise adoption.
  • Strong analytical, problem-solving, and decision-making skills with the ability to balance trade-offs.

Reports To: Director of Product Development

Why You’ll Love Working With Us

This isn’t just another AI job. You’ll be part of a pioneering team pushing the boundaries of what autonomous AI systems can do for Platform Engineering. You’ll have the freedom to innovate, the resources to build at scale, and the support of a collaborative, forward-thinking environment.

We are seeking a skilled AI Agentic Engineer to join our innovative team. The ideal candidate will have a strong background in artificial intelligence, natural language processing, and software development. This role involves designing, developing, and implementing advanced RAG systems and agentic AI solutions that enhance user interaction and content retrieval.

What We’re Looking For

  • You eat, sleep, and breathe generative AI — prompt and context engineering, retrieval-augmented generation (RAG), and autonomous agents are your playground.
  • You know your way around modern AI agent frameworks like LangChain, LlamaIndex, Semantic Kernel, crewAI, and AutoGen — and you’re excited to push their limits.
  • Vector databases like Pinecone, Weaviate, or Chroma? You’re comfortable querying and managing them to power semantic search.
  • Full-stack skills? Absolutely. React + TypeScript on the frontend, Node.js or Python microservices on the backend, and REST or gRPC APIs.
  • DevOps savvy: Kubernetes, Terraform or AWS CDK, plus monitoring tools like Grafana and Prometheus are in your toolkit.

Key Responsibilities

  • Design and Build AI Agents: Create autonomous agents that can reason, plan, act, and collaborate.
  • Prompt Engineering: Develop advanced prompts and roles to guide agent behavior.
  • Memory & Context Integration: Implement systems for agents to store, recall, and use memory in conversations.
  • Tool & API Integration: Enable agents to use external tools and APIs for real-world automation.
  • Multi-Step Reasoning: Build agents capable of decomposing tasks, planning, and self-correcting.
  • Multi-Agent Systems: Deploy and manage groups of agents that collaborate and delegate.
  • Ecosystem Automation: Launch and maintain agentic systems that automate business and technical workflows.
  • Collaborate with data scientists and engineers to refine algorithms and improve the performance of AI models.
  • Conduct thorough testing and validation of developed systems to ensure accuracy and reliability.
  • Stay updated with industry trends and advancements in AI, machine learning, and natural language processing.

The Tech We Love

  • AI Agent & Orchestration: LangChain, LangGraph, crewAI, LlamaIndex, Semantic Kernel, AutoGen
  • Protocols: MCP, Agent2Agent
  • Vector DBs: Pinecone, Weaviate, Chroma
  • Observability & Evaluation: LangSmith, Helicone, PromptLayer, RAGAS
  • CI/CD for LLMs: PromptOps, LlamaTest, GitHub Actions with AI evaluation workflows
  • Telephony: Twilio Programmable Voice, SIP, VAPI

If you are passionate about advancing AI technologies and have the skills to build innovative RAG and agentic systems, we encourage you to apply. Join us in shaping the future of intelligent applications!

Reports To: Director of Cloud Infrastructure

About the Role

We’re seeking a skilled Prompt Engineer specializing in Kubernetes and platform engineering tools to design and optimize prompts that enable Large Language Models (LLMs) to automate complex container orchestration and infrastructure management tasks. You will create precise, context-rich prompts that guide AI models to generate Kubernetes manifests, manage deployments, and interact with platform engineering workflows, boosting developer productivity and operational reliability.

Your work will bridge AI and cloud-native infrastructure, enabling seamless AI-driven automation for Kubernetes clusters, Helm charts, CI/CD pipelines, and platform tooling.

What You’ll Do

  • Develop and refine prompts that instruct LLMs to generate, validate, and optimize Kubernetes YAML manifests, Helm charts, and platform automation scripts.
  • Apply advanced prompt engineering techniques such as zero-shot, few-shot, and chain-of-thought prompting tailored for infrastructure-as-code and container orchestration contexts.
  • Collaborate with DevOps, SRE, and platform engineering teams to understand deployment patterns, best practices, and pain points to craft domain-specific prompt templates.
  • Integrate prompts with AI orchestration frameworks (e.g., LangChain, AutoGen) and Kubernetes management tools to enable autonomous or semi-autonomous platform operations.
  • Continuously evaluate prompt outputs for accuracy, security, and compliance with Kubernetes best practices (e.g., pod scheduling, resource quotas, readiness/liveness probes).
  • Document prompt designs, usage guidelines, and best practices to empower platform teams and AI developers.
  • Stay up-to-date with Kubernetes ecosystem advancements and AI-driven infrastructure automation trends.

Required Skills & Experience

  • Proven experience with prompt engineering for LLMs (OpenAI GPT-4.x, Anthropic Claude, Google Gemini, etc.), especially as applied to Kubernetes or cloud infrastructure automation.
  • Strong understanding of Kubernetes architecture, deployment best practices (Helm, taints/tolerations, autoscaling, probes), and platform engineering workflows.
  • Familiarity with infrastructure-as-code tools (Helm, Terraform, Kubernetes manifests) and container orchestration concepts.
  • Proficiency in Python or TypeScript for scripting and integrating AI prompts with platform tooling.
  • Experience with AI orchestration frameworks such as LangChain, AutoGen, or Semantic Kernel.
  • Knowledge of vector databases (Pinecone, Weaviate, Chroma) and semantic search to enhance prompt context retrieval.
  • Ability to craft clear, positive, and domain-specific prompts that reduce ambiguity and improve AI output quality.
  • Understanding of security and compliance considerations in cloud-native environments.

Preferred Tools & Technologies

  • LLM APIs: OpenAI GPT-4.x, Anthropic Claude 3.x, Google Gemini 2.5, Cohere Command R
  • Prompt Engineering: LangChain, AutoGen, Semantic Kernel, PromptLayer, LangSmith
  • Kubernetes Tools: kubectl, Helm, Kustomize, Terraform
  • Vector Databases: Pinecone, Weaviate, Chroma
  • Orchestration: LangChain Agents, AutoGen, crewAI
  • DevOps & Cloud: Docker, Kubernetes, AWS, GCP, Azure, CI/CD (GitHub Actions, Jenkins)
  • Observability: Prometheus, Grafana, kube-state-metrics

Why Join Us?

  • Work at the intersection of AI and cloud-native technologies to redefine platform automation.
  • Collaborate with experts in AI, DevOps, and platform engineering to build innovative solutions.
  • Influence the future of autonomous infrastructure management powered by prompt engineering.
  • Access to cutting-edge AI tools and continuous learning opportunities.

This role is ideal for prompt engineers passionate about Kubernetes and platform engineering who want to leverage LLMs to automate and optimize cloud infrastructure management through expert prompt design and AI orchestration.

Reports To: Director of Cloud Infrastructure

Role Overview:
Join our Core Kubernetes Operator Development team, where we’re pushing the boundaries of Kubernetes innovation. As a Kubernetes Controller Developer (Golang), you will play a crucial role in building “01”, our cloud-agnostic Platform as a Service (PaaS), driven by full-fledged Kubernetes operators and agents.

This position requires a strong background in Kubernetes internals and Golang programming, particularly in developing and managing Kubernetes controllers. If you’re a proactive problem solver with experience in building cloud-native infrastructure, this is your opportunity to contribute to a transformative platform.

We highly encourage candidates with a solid programming foundation and a hunger to explore the cloud-native world to apply. Comprehensive onboarding and professional development support will be provided.

Key Responsibilities (Not limited to):

  • Collaborate in Agile teams, taking ownership of development stories with minimal supervision.
  • Partner with internal teams and clients to accurately capture technical requirements.
  • Design, build, deploy, and maintain Kubernetes controllers and operators using Golang.
  • Identify gaps in current systems and propose or implement technical improvements.
  • Apply best practices across the full software development lifecycle.
  • Create and execute unit, regression, and E2E tests for operator reliability.
  • Work in Linux environments and troubleshoot issues in containerized applications.
  • Contribute to CI/CD workflows for seamless testing and deployment.

Essential Skillset:

  • Kubernetes Controller Development: Proven expertise in building and maintaining controllers and operators.
  • Proficiency in Golang: 2+ years writing idiomatic, well-tested Go code for Kubernetes projects.
  • Deep understanding of Kubernetes APIs and libraries including client-go, CRDs, and API extensions.
  • Hands-on experience with:
    • Kubebuilder – For scaffolding controllers and CRDs
    • Operator SDK – For building Operators with OLM support
    • controller-runtime – For abstracting Kubernetes client logic
  • Strong testing skills, including unit, load, and E2E tests for operators.
  • Familiarity with containerization (Docker) and orchestration (Kubernetes).
  • Comfortable working in Linux with debugging tools and CLI.
  • 2+ years of experience working with CI/CD tools like Jenkins, GitHub Actions, Tekton, or similar.

Preferred Skills (Nice to Have):

  • CKA or CKAD certifications.
  • Hands-on experience managing production-grade Kubernetes clusters.
  • Knowledge of Infrastructure as Code tools (e.g., Terraform).
  • Exposure to major cloud providers: AWS, GCP, or Azure.
  • Scripting experience in Shell or Python.

What We Offer:

  • A chance to build infrastructure automation tools that power real-world workloads.
  • Opportunity to work on bleeding-edge cloud-native technologies with a global impact.
  • Collaborative and innovation-driven culture, with strong engineering mentorship.
  • Remote-friendly setup and flexible work culture.
  • Career development in one of the most in-demand areas of DevOps.
