@ba_niu80557: https://x.com/ba_niu80557/status/2069042546886787419

X AI KOLs Timeline 06/22/26, 12:59 PM News

Summary

This article explores the true meaning of Forward Deployed Engineering (FDE) in AI deployment, emphasizing that FDE is not simply about API calls or building agents, but rather a systematic engineering approach geared toward production deployment, including business translation, system design, platform integration, production operations, and capability accumulation.

https://t.co/7EiGRkViC9

Cached at: 06/23/26, 06:00 AM

Knowing How to Call an API Doesn’t Make You an FDE, and Knowing How to Build an Agent Doesn’t Either: A Deep Dive into What FDE Really Means in AI Deployment

▍Abstract

If we interpret FDE in this article as Forward Deployed Engineering, it is not simply “a full-stack engineer who knows how to call model APIs.” It is a delivery and engineering operating model oriented toward production deployment: engineers are embedded directly within business teams, work on problem definition, technology selection, system integration, launch, adoption, and feedback loops around real workflows, and then abstract field experience into reusable platform capabilities. OpenAI’s official description of the FDE role is explicit: FDEs turn frontier models into production systems, own discovery, technical scoping, system design, build, and production rollout, and measure success by adoption rates, measurable workflow impact, and eval-driven feedback. In 2026, OpenAI further organized this model as the Deployment Company. Anthropic, in its collaboration with DXC, has also extended “forward deployed engineers focused on customer deployments” to larger-scale enterprise transformation scenarios.

This is also why, when discussing FDE today, we cannot only talk about “Agent orchestration” or “RAG with knowledge bases.” Real AI/ML production systems change along three axes at once: code, data, and models. Martin Fowler’s CD4ML explicitly states that machine learning applications must manage code, data, and models together as reproducible artifacts. Google’s classic paper “Hidden Technical Debt in Machine Learning Systems” reminds us that the major difficulty of ML systems lies not in the model itself, but in coupling, dependencies, feedback loops, and invisible technical debt. The core value of FDE is to consolidate these scattered issues into a deliverable, observable, governable, and scalable system.

For technical practitioners and engineering leaders, the most reliable FDE reference stack is not “chasing the latest demo,” but combining: stateful agents, deterministic workflows, model registries and evaluation, feature/data layers, GitOps, observability, data contracts, and security controls. Use LangGraph or the OpenAI Agents SDK to manage long-running tasks and human-in-the-loop; use Airflow, Prefect, or Dagster for deterministic pipelines; use MLflow for experiment management, evaluation, and model lifecycle; use Feast for training/inference feature consistency; use KServe or BentoML for inference serving; use GitHub Actions + Argo CD for CI/CD and GitOps; use OpenTelemetry for unified traces, metrics, and logs; use dbt contracts and data quality checks to control upstream drift.

My overall judgment is straightforward: FDE is not a “premium job title” within an AI team, but the last mile of systems engineering that turns model capabilities into long-term business output. In 2026, the ceiling of this role is no longer “building a chatbot,” but rewriting workflows, restructuring data paths, embedding control planes, quantifying ROI, and platformizing field tactics.

▍Definition and Boundaries of FDE

In the context of this article, FDE focuses on AI/ML production deployment, not traditional full-stack web development, nor data engineering focused solely on data warehouses and ETL. OpenAI’s official role description places FDE at the intersection of “customer delivery” and “core platform development”: FDEs must work closely with customers and domain teams while abstracting field tactics into tools, playbooks, and building blocks. OpenAI’s role description for government FDEs further emphasizes that observability must cover the entire chain from infrastructure to application.

This definition is highly consistent with Palantir’s narrative of forward deployed/AI deployment. On its Baseline page, Palantir emphasizes that its AIP can be deployed across multi-cloud, on-premise, and government networks. Palantir’s AI FDE documentation places permission inheritance, minimal context exposure, and adherence to existing authorization systems at the core. This means the “boundary” of FDE naturally extends through the application layer to network, identity, permissions, audit, data paths, and environment isolation. In other words, someone who only knows how to write prompts but cannot design permission boundaries and deployment topologies is not a production-grade FDE. Someone who only knows Kubernetes but does not understand business processes and user adoption is not a complete FDE either.

Therefore, FDE is more of a responsibility combination than a single technical label. It bears at least five types of responsibilities:
First, business translation: converging vague requirements into workflows that can be deployed and measured.
Second, system design: deciding architectures such as single-agent, multi-agent, RAG, batch/stream hybrid, online/offline division.
Third, platform integration: connecting models to data, tools, IAM, approval processes, and monitoring.
Fourth, production operations: handling eval, canary deployments, rollbacks, alerts, capacity, and cost governance.
Fifth, capability consolidation: turning one-off deliveries into organizational templates, SDKs, components, and best practices. OpenAI’s role descriptions, the Deployment Company article, and the FDSWE page all emphasize this positive feedback loop of “field signal → platform abstraction.”

If I had to give engineering leaders one practical criterion, it would be this: the FDE is the person ultimately accountable for “high-value AI workflows.” The output they are responsible for is not a demo, but a system that is stable after launch, with auditable processes, scalable operations, and metrics that can be continuously improved.

▍Architecture Patterns and Key Components

The correct architectural view of FDE in AI/ML production is not “make everything an agent,” but first decompose the system into a control plane and an execution plane. The control plane handles planning, routing, policies, evaluation, versioning, and human intervention. The execution plane handles retrieval, feature reading, model inference, tool calls, and transactional actions. LangGraph models agent workflows as graphs and emphasizes long-running operations, persistence, fault recovery, and human-in-the-loop. The OpenAI Agents SDK also treats state, results, guardrails, human review, and tracing as first-class citizens of the runtime. For production systems, this indicates that agents are no longer just “conversation loops,” but an application runtime requiring state machines, audit trails, and approval gates.
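
The control-plane/execution-plane split can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `Action`, `ControlPlane`, `execute`, and the `lookup_order`/`refund` tools are hypothetical names, not part of LangGraph or the OpenAI Agents SDK, and the `approver` lambda stands in for a human reviewer. The point is that policy and approval decisions happen before any transactional action runs, and that every decision lands in an audit log.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    tool: str            # name of the tool the agent wants to call
    args: dict           # arguments proposed by the agent
    transactional: bool  # True if the call changes external state


@dataclass
class ControlPlane:
    # Hypothetical policy: transactional actions need explicit approval.
    approver: Callable[[Action], bool]
    audit_log: list = field(default_factory=list)

    def authorize(self, action: Action) -> bool:
        approved = (not action.transactional) or self.approver(action)
        # Record every decision so the run can be audited later.
        self.audit_log.append((action.tool, "approved" if approved else "blocked"))
        return approved


def execute(action: Action, tools: dict, control: ControlPlane) -> dict:
    """Execution plane: runs a tool only after the control plane authorizes it."""
    if not control.authorize(action):
        return {"status": "blocked", "tool": action.tool}
    return {"status": "ok", "result": tools[action.tool](**action.args)}


# Illustrative tools: one read-only lookup and one transactional refund.
tools = {
    "lookup_order": lambda order_id: {"order_id": order_id, "state": "shipped"},
    "refund": lambda order_id, amount: {"order_id": order_id, "refunded": amount},
}
# Stand-in for a human reviewer who approves refunds up to 100.
control = ControlPlane(approver=lambda a: a.args.get("amount", 0) <= 100)

read = execute(Action("lookup_order", {"order_id": "A1"}, False), tools, control)
small = execute(Action("refund", {"order_id": "A1", "amount": 50}, True), tools, control)
large = execute(Action("refund", {"order_id": "A1", "amount": 500}, True), tools, control)
```

In a real deployment the approval step would typically be an interrupt that pauses the run until a person responds, and the audit log would be persisted rather than held in memory.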

At the same time, the agent layer must not be mistaken for the entire platform. Deterministic data and model flows still need dedicated orchestrators: Airflow defines itself as a platform to programmatically author, schedule, and monitor batch-oriented workflows; Prefect emphasizes turning Python functions directly into production data pipelines; Dagster places “software-defined assets” and asset checks at the center, suitable for modeling data products and quality checks as assets rather than scripts. For FDE, the most common hybrid pattern is: use agents for uncertain interactions, use pipelines for deterministic processing and backfilling.

The data layer must also be layered. Feast’s documentation explicitly states its goal is to provide production-grade feature serving for training and inference, supporting low-latency reads through an online store; dbt’s model contracts require the output dataset shape, field names, and types to strictly match YAML definitions, failing builds otherwise; Great Expectations and Dagster asset checks make validation and asset property tests explicit. For FDE, this set of capabilities collectively solves a very common but often underestimated problem: business teams see “the model answered incorrectly,” but what engineering teams really need to investigate are often schema changes, missing features, freshness issues, or training/inference inconsistencies.
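
To make the “check the data before blaming the model” step concrete, here is a small sketch that inspects the features served for one request. The function name `diagnose_features`, the feature names, and the 24-hour freshness limit are assumptions chosen for illustration; they do not come from Feast, dbt, or any other tool named above.

```python
from datetime import datetime, timedelta, timezone


def diagnose_features(features: dict, expected: set, updated_at: datetime,
                      now: datetime, max_age: timedelta) -> list:
    """Return data-side findings to review before investigating the model."""
    findings = []
    # A feature counts as present only if it exists and is not None.
    present = {name for name, value in features.items() if value is not None}
    missing = sorted(expected - present)
    if missing:
        findings.append(f"missing features: {missing}")
    if now - updated_at > max_age:
        findings.append(f"stale features: last updated {now - updated_at} ago")
    return findings


now = datetime(2026, 6, 22, 12, 0, tzinfo=timezone.utc)
findings = diagnose_features(
    features={"tenure_days": 412, "avg_basket": None},
    expected={"tenure_days", "avg_basket"},
    updated_at=now - timedelta(hours=30),
    now=now,
    max_age=timedelta(hours=24),
)
```

Here `findings` reports both a missing feature and a freshness violation, which is the kind of signal that should appear next to the model response when a business user reports a wrong answer.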

The serving layer also cannot focus solely on the model itself. KServe positions itself as a unified Kubernetes AI inference platform, emphasizing cloud-agnostic, Generative + Predictive AI on the same stack, canary deployments, auto-scaling based on token throughput/queue depth/GPU utilization, and the Open Inference Protocol. BentoML leans more toward application development and API exposure, on one hand supporting vLLM/OpenAI-compatible endpoints, and on the other hand providing examples for LangGraph, function calling, and multi-model composition. A mature FDE team usually makes a clear division here: KServe is more suitable for platform teams to provide a unified inference base, while BentoML is more suitable for application teams to quickly package compound AI services.

The following diagram can clarify the “minimum viable full stack” of FDE production architecture:

This diagram does not reflect any single vendor’s product map, but the key dependencies FDE repeatedly encounters in production: code, data, models, tools, permissions, and observability. It aligns with CD4ML’s emphasis on the three axes of code/data/model, LangGraph/OpenAI’s requirements for stateful agents, and KServe/Feast/dbt/OTel’s definitions for serving, features, contracts, and observability.

In pattern selection, a particularly common pitfall is “premature multi-agent architecture.” In OpenAI’s public technical sharing about Klarna, one important lesson from Klarna’s customer service system was: first use a single agent + strong retrieval/strong routing to accurately select a large number of routines, instead of building a layered agent hierarchy from the start. Only when problems are naturally decomposable, tool context is too broad, or multiple parallel sub-tasks clearly benefit, should multi-agent be introduced. This experience aligns well with LangGraph’s positioning as a “low-level orchestration framework”: first draw the graph clearly, then decide whether to increase the number of agents.
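
A toy version of “single agent + strong routing” looks like the sketch below. It is not Klarna’s or OpenAI’s implementation: the routine catalog is invented, and word overlap is a deliberately simple stand-in for whatever retrieval or intent classifier a real system would use. What it does show is the shape of the decision: pick one routine from a catalog, and hand off to a human when confidence is low.

```python
import re

# Hypothetical routine catalog: name -> short description used for matching.
ROUTINES = {
    "refund_request": "customer asks for a refund or money back for an order",
    "delivery_status": "customer asks where the order or package is",
    "change_address": "customer wants to change the delivery address",
}


def tokens(text: str) -> set:
    """Lowercase word set; a stand-in for embeddings or a trained classifier."""
    return set(re.findall(r"[a-z]+", text.lower()))


def route(message: str, min_overlap: int = 2) -> str:
    """Pick the routine whose description shares the most words with the message."""
    msg = tokens(message)
    scores = {name: len(msg & tokens(desc)) for name, desc in ROUTINES.items()}
    best = max(scores, key=scores.get)
    # Below the threshold, hand off to a human instead of guessing.
    return best if scores[best] >= min_overlap else "human_handoff"
```

For example, `route("Where is my package?")` selects `delivery_status`, while `route("Hello there")` falls through to `human_handoff`. Routing accuracy measured on a labeled set of messages is the metric to watch before adding more agents.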

▍Engineering Implementation, Open Source Stack, and Tool Selection

In engineering implementation, I suggest compressing FDE technology decisions into three questions: Where to deploy? Who holds the state? Who controls the release?
If data is sensitive, sovereignty requirements are high, and the team already has Kubernetes/SRE capabilities, the most reliable path is usually to build a Kubernetes base, use Deployments/HPAs for stateless service elasticity, expose a unified inference interface with KServe, and use Argo CD for declarative GitOps releases. The official Kubernetes documentation clearly states that Deployments manage declarative updates, and HPAs dynamically adjust replica counts based on CPU, memory, or custom metrics. Argo CD defines itself as a declarative GitOps CD tool, offering multi-tenant installation modes commonly used by multiple teams.
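
For capacity planning it helps to know what the HPA actually computes. The Kubernetes documentation gives the core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below reproduces only that formula; the function name is my own, and the real controller additionally applies a tolerance band, min/max replica bounds, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Core HPA scaling formula, without tolerance, bounds, or stabilization."""
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 replicas at 90% average CPU against a 60% target -> scale out to 6.
scale_out = desired_replicas(4, 90, 60)
# 4 replicas at 30% average CPU against a 60% target -> scale in to 2.
scale_in = desired_replicas(4, 30, 60)
```

The same arithmetic applies to custom metrics such as queue depth or tokens per second, which is why choosing the target value matters more than the choice of metric source.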

If the team prioritizes speed over platform autonomy, managed platforms are usually faster: Amazon SageMaker AI integrates training, customization, inference, governance, observability, and managed MLflow into one unified service; Azure Machine Learning provides end-to-end ML lifecycle management, registries, and MLOps; in 2026, Google Cloud evolved Vertex AI into the Gemini Enterprise Agent Platform, emphasizing building, scaling, governing, and optimizing enterprise-grade agents. The real trade-off here is not “which has the most features,” but whether your team is willing to hand over runtime, network, identity boundaries, and orchestration control to the cloud provider in exchange for shorter delivery cycles.

A third path closer to FDE field reality is hybrid control/data plane: model or agent orchestration can run on managed or unified control planes, but data and transactional execution remain within dedicated networks, private VPCs, on-premise environments, or government networks. Palantir’s Baseline statements about multi-cloud, on-premise, and government network deployments, along with OpenAI’s government FDE emphasis on secure, compliant deployments, all highlight the necessity of this combination in the real world.

Below are two minimal snippets that approximate production configuration. They are not direct copies of official docs but simplified adaptations that stay within officially documented capabilities.

A minimal YAML for KServe inference serving:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: risk-agent-model
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://ml-models/risk-agent/v1/

This snippet follows KServe’s InferenceService resource and its design goals of “minutes to production / canary / autoscaling / enterprise-scale ready.” In real production, you would typically also add traffic splitting, resource limits, probes, image signatures, and network policies.

A minimal YAML for dbt data contracts:

models:
  - name: feature_customer_health_v1
    config:
      contract:
        enforced: true
    columns:
      - name: customer_id
        data_type: string
      - name: churn_risk_score
        data_type: float

The dbt documentation clearly states that an enforced contract forces the model’s returned dataset to exactly match the fields and types defined in the YAML; otherwise, the build fails. Such contracts are ideal for FDE teams as release gates, because they turn “data structure changes” from implicit production incidents into explicit build failures.
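
The same gate idea can be applied to data that never passes through dbt, for example a payload returned by an upstream API. The sketch below is not how dbt enforces contracts (dbt checks the model’s columns at build time); it is a hand-rolled Python illustration, and `CONTRACT` and `contract_violations` are hypothetical names that simply mirror the two columns in the YAML above.

```python
# Hypothetical contract mirroring the YAML above: column name -> expected type.
CONTRACT = {"customer_id": str, "churn_risk_score": float}


def contract_violations(rows: list, contract: dict) -> list:
    """Return human-readable violations; an empty list means the gate passes."""
    violations = []
    for i, row in enumerate(rows):
        if set(row) != set(contract):
            missing = sorted(set(contract) - set(row))
            extra = sorted(set(row) - set(contract))
            violations.append(f"row {i}: missing={missing} extra={extra}")
            continue
        for column, expected in contract.items():
            if not isinstance(row[column], expected):
                violations.append(
                    f"row {i}: {column} is {type(row[column]).__name__}, "
                    f"expected {expected.__name__}"
                )
    return violations


good = [{"customer_id": "c-1", "churn_risk_score": 0.42}]
renamed = [{"customer_id": "c-1", "churn_score": 0.42}]     # upstream column rename
retyped = [{"customer_id": 1001, "churn_risk_score": 0.42}]  # upstream type change
```

A CI step that fails when `contract_violations(...)` is non-empty gives the same effect as the enforced contract: the rename or type change surfaces as a failed build, not as a production incident.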

If your application is more like “an API-fied compound AI service,” BentoML is lighter than deploying KServe from the start. Its official examples demonstrate a vLLM-based OpenAI-compatible LLM API, and also provide templates for LangGraph agents, function calling, and GitHub Actions CI/CD to BentoCloud. This approach is especially suitable for FDE teams to quickly bundle models, tools, and business APIs into a deployable service on the customer front line.

The following table is a pragmatic selection framework I offer to engineering leaders:

Platform or Combination	Best Suited For	Key Capabilities	Main Advantages	Main Constraints	Cost Signal
Amazon SageMaker AI	Teams wanting an “all-in-one cloud AI platform” quickly	Training, customization, inference, governance, observability, managed MLflow all in one unified service	High degree of managed service, suitable for fast experiment-to-prod	Stronger cloud lock-in; complex networking and cross-environment governance must follow AWS patterns	Official: pay-per-use, with Savings Plans for cost reduction
Azure Machine Learning	Teams already heavily using Azure, needing Registry/MLOps/enterprise governance	End-to-end ML lifecycle, model registry, environment versioning, cross-dev/test/prod registries	High integration with Microsoft enterprise stack; smooth for traditional ML + GenAI ops	Heavy system with many abstractions; steep learning curve for new teams	Official: no additional service fee for Azure ML itself, but incurs costs for compute, storage, Key Vault, Container Registry, Application Insights, etc.
Gemini Enterprise Agent Platform	Teams focused on enterprise agents, Google ecosystem data, and global scale deployment	Build, scale, govern, optimize enterprise-grade agents, access to many foundation models	Most complete narrative for agent scenarios; clear model/platform/governance integration	Platform naming and product boundaries still evolving in 2026; verify specific capability ownership before purchasing	Official: pricing calculator, custom quote, credits for new customers; overall typical cloud usage-based signal
Open-source self-built combo: Kubeflow + KServe + MLflow + Feast + Argo CD + OTel + dbt	Teams requiring high control, private networks, sovereignty, cross-cloud portability	Kubeflow as AI platform foundation; KServe for inference; MLflow for experiments/registry; Feast for features; Argo CD for GitOps; OTel for observability; dbt for contracts	Highest controllability, best for consolidating organizational FDE templates	Highest platform engineering requirements; must manage SRE/security/upgrades/cost governance	Low software license cost; real cost is mainly Kubernetes, object storage, GPU/CPU, logging, and team ops labor

If I can give one very practical piece of advice: in the first phase, do not pursue a “comprehensive unified platform.” First build the shortest closed loop for one high-value workflow. In the second phase, abstract the proven control plane into a platform. This is how FDE works, in contrast to traditional platform teams that first build a “grand unified platform” and only then look for business use cases.

▍Case Studies and Lessons Learned

Morgan Stanley

The public case of Morgan Stanley and OpenAI is a good financial industry sample for understanding FDE. The OpenAI official page mentions that before deployment, Morgan Stanley put each AI use case through an evaluation framework, letting expert feedback continuously enter the improvement loop. Its AI @ Morgan Stanley Assistant has been adopted by more than 98% of its advisor teams for internal knowledge retrieval and response support. An earlier OpenAI case summary also mentioned that advisor access to knowledge increased from 20% to 80%. For scenarios like finance, healthcare, and government, the most important insight from this case is not “they used GPT-4,” but that eval and controls came first, followed by large-scale adoption.

Klarna

Klarna is the most frequently cited and most easily misunderstood case of FDE production deployment. OpenAI’s official customer case states that Klarna’s AI assistant handled 2.3 million conversations in its first month, taking over two-thirds of customer service chat volume, equivalent to the workload of about 700 full-time agents. The average resolution time dropped from 11 minutes to under 2 minutes, repeat inquiries decreased by 25%, and satisfaction remained on par with human agents. Even more valuable is the architectural insight from OpenAI’s technical sharing: Klarna did not win with a complex hierarchy of agents, but with a single agent plus very strong routine retrieval and intent routing, selecting “which process to follow” as early and as precisely as possible from a large set of routines. The conclusion for FDE is clear: invest complexity in routing and process modeling, not in the number of agents.

Uber

Uber’s Michelangelo platform demonstrates another dimension: when FDE-style deliveries repeatedly succeed across multiple business scenarios, the organization eventually platformizes the approach. Uber’s public materials show that Michelangelo covers the end-to-end ML lifecycle; by a 2024 public article, it had about 400 active projects, 20,000+ training runs per month, 5,000+ production models, and peak real-time predictions of 10 million per second. By a 2025 article on model deployment safety, Uber disclosed that peak real-time predictions exceeded 15 million per second, and introduced health measurement, rollout gates, and instant rollback throughout the lifecycle. This case tells FDE teams two things: First, field delivery must eventually crystallize into platform capabilities. Second, model deployment safety is not the final pre-launch test, but an automated safeguard spanning the entire lifecycle, starting from data and code artifacts.
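
The rollout-gate idea can be sketched as a staged loop that advances traffic only while health holds. This is not Uber’s implementation; the stage percentages, the `tolerance` value, and the `error_rate_at` callable are assumptions made for the example.

```python
STAGES = [1, 10, 50, 100]  # percent of traffic per stage; illustrative values


def rollout(error_rate_at, baseline_error_rate: float, tolerance: float = 0.01) -> dict:
    """Advance through traffic stages; roll back as soon as health regresses.

    error_rate_at(percent) is a hypothetical callable returning the new
    version's observed error rate at that traffic share.
    """
    history = []
    for percent in STAGES:
        observed = error_rate_at(percent)
        healthy = observed <= baseline_error_rate + tolerance
        history.append((percent, observed, healthy))
        if not healthy:
            # Instant rollback: send all traffic back to the previous version.
            return {"status": "rolled_back", "traffic": 0, "history": history}
    return {"status": "promoted", "traffic": 100, "history": history}


stable = rollout(lambda percent: 0.02, baseline_error_rate=0.02)
regressing = rollout(lambda percent: 0.02 if percent < 50 else 0.08,
                     baseline_error_rate=0.02)
```

In this run `stable` is promoted to 100% of traffic, while `regressing` is rolled back at the 50% stage. A production gate would look at more than one health signal (latency, prediction drift, business metrics), but the control flow stays the same.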

Looking at these three cases together, the experience can almost be summarized in one sentence: Morgan Stanley tells you to build evaluation and trust first; Klarna tells you not to over-agent the system; Uber tells you that successful cases must be platformized, otherwise, technical debt will swallow you as delivery scale grows.

▍Security, Compliance, and Operational Risk

The most underestimated part of FDE is not model capability, but the speed at which the risk surface expands. The NIST AI RMF divides AI risk management into four functions: Govern, Map, Measure, and Manage, emphasizing it as a voluntary framework for the design, development, deployment, and use phases. NIST also released the GenAI Profile as a companion resource for generative AI. For FDE teams, this means security and compliance cannot be left to a pre-launch “review meeting”; they must be embedded in daily engineering artifacts: versions, logs, approvals, tests, rollbacks, and accountability must be designed in from development.

In threat modeling, the OWASP LLM Top 10 highlights issues like prompt injection, insecure output handling, training data poisoning, model DoS, supply chain vulnerabilities, sensitive information disclosure, and excessive agency. The MITRE ATLAS systematizes adversarial tactics and techniques for AI systems. The practical implication for FDE is: you are no longer delivering a regular web app, but a composite system that can be attacked by user input, external documents, plugins/tools, model supply chains, and long-running agents simultaneously.

Therefore, production-grade FDE systems generally require at least six mitigation layers:
First, permission minimization: tool calls must be scoped to the minimum necessary. Palantir’s AI FDE documentation emphasizes adhering to existing permission models and exposing context based on permissions.
Second, approval and human review: OpenAI Agents documentation and security best practices emphasize guardrails, human review, moderation, and adversarial testing.
Third, harness-level observability: Anthropic’s research on agent sabotage and monitoring points out that the key control point for long-running agents is the harness/runtime envelope layer, because all actions pass through it, which makes it the most suitable place for observation and verification.
Fourth, supply chain integrity: SLSA aims to improve build integrity and provenance, reducing the risk of tampering with artifacts and build chains.
Fifth, content and data protection: moderation, PII redaction, output validation, and structured constraints should be built in.
Sixth, a runtime kill switch: once abnormal tool calls, unauthorized reads, incorrect routing, or runaway cost are detected, the system must be able to pause, isolate, or roll back.
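
Two of these layers, permission minimization and the runtime kill switch, fit naturally at the harness level because every tool call passes through it. The sketch below is a hypothetical gateway written for illustration; `ToolGateway`, `KillSwitchTripped`, and the threshold of two denied calls are assumptions, not an API from any vendor mentioned here.

```python
class KillSwitchTripped(Exception):
    """Raised when the run has been paused and needs manual review."""


class ToolGateway:
    """Hypothetical harness-level gateway that every tool call passes through."""

    def __init__(self, allowed_tools: set, max_denied: int = 2):
        self.allowed_tools = allowed_tools  # least-privilege allowlist
        self.max_denied = max_denied        # denied calls tolerated before pausing
        self.denied = 0
        self.paused = False

    def call(self, tool: str, fn, *args):
        if self.paused:
            raise KillSwitchTripped("run is paused; manual review required")
        if tool not in self.allowed_tools:
            self.denied += 1
            if self.denied >= self.max_denied:
                self.paused = True  # kill switch: stop all further tool calls
            raise PermissionError(f"tool '{tool}' is outside this agent's scope")
        return fn(*args)


gateway = ToolGateway(allowed_tools={"search_kb"})
outcomes = []
for tool in ["search_kb", "delete_account", "export_all_users", "search_kb"]:
    try:
        gateway.call(tool, lambda: "ok")
        outcomes.append("ok")
    except PermissionError:
        outcomes.append("denied")
    except KillSwitchTripped:
        outcomes.append("paused")
```

After two out-of-scope attempts the gateway pauses the run, so even the previously allowed `search_kb` call is refused until someone reviews what happened. The same hook is a reasonable place to enforce cost ceilings.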

On the compliance side, global teams must also face regulatory timelines. The EU AI Act’s formal regulation is Regulation (EU) 2024/1689. Public implementation timeline materials show it entered into force on August 1, 2024, with most provisions applying from August 2, 2026, and different category obligations phased in based on risk and scope. For FDE teams serving Japanese, European, or cross-border business, this means that the logs, audits, documentation, model inventories, and responsibility boundaries you design now will likely determine compliance costs two years later.

A truly mature FDE team turns these requirements into engineering defaults, not things to “patch when security colleagues ask.” This is also why I favor systems based on policy-as-code, contracts-as-code, evals-as-code, and deployment-as-code, rather than approaches relying on “experience and manual review to handle everything.”
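
As one concrete reading of “evals-as-code,” the sketch below treats an evaluation set as a release gate that runs in CI. Everything in it is illustrative: the golden set, the substring check, the 0.9 threshold, and the canned answers that stand in for the system under test are assumptions, not part of MLflow or any other evaluation tool.

```python
# Hypothetical golden set: (input, substring the answer must contain).
GOLDEN_SET = [
    ("What is the refund window?", "30 days"),
    ("Which plan includes SSO?", "Enterprise"),
    ("How do I reset my password?", "reset link"),
]


def run_eval(answer_fn, cases, threshold: float = 0.9) -> dict:
    """Score answer_fn on the cases and decide whether the release may proceed."""
    failures = [q for q, expected in cases if expected not in answer_fn(q)]
    pass_rate = 1 - len(failures) / len(cases)
    return {
        "pass_rate": pass_rate,
        "release_allowed": pass_rate >= threshold,
        "failures": failures,
    }


# Stand-in for the system under test; a real run would call the deployed agent.
CANNED = {
    "What is the refund window?": "Refunds are accepted within 30 days.",
    "Which plan includes SSO?": "SSO is part of the Enterprise plan.",
    "How do I reset my password?": "Contact support.",
}
report = run_eval(lambda q: CANNED[q], GOLDEN_SET)
```

Here one of three cases fails, so the pass rate falls below the threshold and `release_allowed` is false. Keeping the golden set in version control next to prompts and routing rules is what makes a regression visible at review time.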

▍KPIs, Monitoring, and Team Roadmap

FDE KPIs cannot just look at “call volume” or “model accuracy.” A more reasonable approach is to divide them into four layers:

Business layer: adoption rate, completion rate, human agent replacement rate, average handling time, profit or cost improvement. Both Morgan Stanley and Klarna’s public cases show that enterprises truly care about adoption, workflow impact, and unit output, not model benchmarks.

Quality layer: eval scores, human review pass rate, task completion rate, routing accuracy, retry rate, change regression.

System layer: latency, traffic, errors, saturation, along with queue length, GPU utilization, timeout rate, and rollback frequency. Google SRE’s “Four Golden Signals” remain the most effective skeleton.

Data layer: freshness, schema breakage, missing features, training/inference skew, contract violations, quality check pass rate.

In monitoring implementation, OpenTelemetry has given a very clear direction: observability relies on traces, metrics, and logs. The Collector receives, processes, and exports telemetry. The OpenAI Agents SDK treats model calls, tool calls, handoffs, guardrails, and custom spans as traceable run records. MLflow also integrates observability, evaluation, prompt management, and model management into a unified platform. For FDE teams, a practical standard is: for any online task failure, you should be able to see input, retrieval, routing, tool calls, model responses, human intervention, and final results in a single trace. If you can’t, the system has not truly entered a production-maintainable state.
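
The “everything in a single trace” standard can be turned into an automated check. The sketch below uses only the standard library and is not the OpenTelemetry API; in practice each `record` call would be an OpenTelemetry span. The `Trace` class and the stage names are assumptions derived from the list in the paragraph above.

```python
import uuid
from dataclasses import dataclass, field

# Stages that should be visible in one trace for any online task.
REQUIRED_STAGES = ["input", "retrieval", "routing", "tool_call",
                   "model_response", "human_review", "final_result"]


@dataclass
class Trace:
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: list = field(default_factory=list)

    def record(self, stage: str, **attributes):
        # Every span carries the same trace_id so one query returns the whole run.
        self.spans.append({"trace_id": self.trace_id, "stage": stage, **attributes})

    def missing_stages(self) -> list:
        seen = {span["stage"] for span in self.spans}
        return [stage for stage in REQUIRED_STAGES if stage not in seen]


trace = Trace()
trace.record("input", text="cancel my order")
trace.record("retrieval", documents=3)
trace.record("routing", routine="cancel_order")
trace.record("tool_call", tool="orders.cancel", status="timeout")
trace.record("model_response", tokens=182)
trace.record("final_result", status="failed")
```

For this failed run, `trace.missing_stages()` returns `["human_review"]`, which tells the on-call engineer that the escalation path was never reached. Running the same completeness check over sampled production traces is a cheap way to find instrumentation gaps.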

Furthermore, KPIs must be tied to release control. Google SRE’s error budget concept is very suitable for FDE: when critical SLOs are continuously eroded, the team should pause high-risk changes and prioritize fixing stability issues. Uber’s safe deployment practices prove that translating health measurements into rollout gates, alerts, and instant rollbacks truly balances delivery speed and production safety.
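
The error budget arithmetic is simple enough to encode directly in the release pipeline. With a 99% SLO over 100,000 requests, the budget is (1 − 0.99) × 100,000 = 1,000 failed requests. The function below is a sketch with hypothetical names; the policy of freezing high-risk changes once the budget is spent follows the idea described above, not any specific tool.

```python
def error_budget_status(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Compare failures so far against the budget implied by the SLO."""
    allowed_failures = (1 - slo_target) * total_requests
    consumed = failed_requests / allowed_failures if allowed_failures else float("inf")
    return {
        "allowed_failures": allowed_failures,
        "budget_consumed": consumed,
        # Freeze high-risk changes once the budget for the window is spent.
        "high_risk_changes_allowed": consumed < 1.0,
    }


within = error_budget_status(slo_target=0.99, total_requests=100_000, failed_requests=400)
exhausted = error_budget_status(slo_target=0.99, total_requests=100_000, failed_requests=1_200)
```

With 400 failures about 40% of the budget is used and releases continue; with 1,200 failures the budget is exceeded and the gate closes until stability work brings the service back within its SLO.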

The following roadmap is what I believe most teams can execute for FDE progression:

The basis for this roadmap is straightforward: OpenAI’s FDE responsibility definition emphasizes going from discovery to stable production; CD4ML emphasizes releasing code/data/models in small, safe, and reproducible steps; Uber’s practice shows that rollout gates and automatic rollbacks must enter the lifecycle early; both Anthropic and OpenAI place eval/trace at the core of agent system reliability.

For the team capability checklist, I suggest covering at least the following seven items, treating them as the team’s combined capabilities rather than requiring any single person to master all of them:

Business modeling: ability to convert vague processes into actionable system boundaries and KPIs.
Agent and workflow orchestration: knowing how to distinguish single agent, graph orchestration, multi-agent, deterministic pipeline.
Data and contract governance: using contracts, quality checks, freshness, and lineage to control upstream changes.
Model lifecycle and evaluation: capable of registry, dataset lineage, trace-based eval, and regression comparison.
Platform and release: understanding Kubernetes, GitHub Actions, GitOps, canary, rollback.
Observability and SRE: able to design dashboards and alerts around traces/metrics/logs/SLO.
Security and compliance: able to make prompt injection, permissions, supply chain, and audit into default controls.

Technical Appendix: Minimal Viable Snippets

Stateful Agent Graph Orchestration

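from langgraph.graph import END, START, StateGraph

# State (a TypedDict) and the retrieve/act node functions are assumed to be defined elsewhere.
graph = StateGraph(State)
graph.add_node("retrieve", retrieve)
graph.add_node("act", act)
graph.add_edge(START, "retrieve")  # entry point
graph.add_edge("retrieve", "act")
graph.add_edge("act", END)
app = graph.compile()  # pass a checkpointer here to persist state across steps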

Reference: langchain-ai/langgraph and official overview.

KServe Inference Serving

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
spec:
  predictor:
    model:
      storageUri: s3://...

Reference: kserve/kserve and KServe getting started/genai docs.

BentoML Exposing OpenAI-compatible LLM API

class LLM:
    def __command__(self):
        return ["vllm", "serve", model_id]

Reference: bentoml/BentoVLLM, and BentoML’s vLLM and LangGraph examples.

dbt Data Contract

config:
  contract:
    enforced: true
columns:
  - name: customer_id
    data_type: string

Reference: dbt-labs/dbt-core and dbt contracts/model contracts docs.

Feast Online Feature Retrieval

from feast import FeatureStore

store = FeatureStore(repo_path=".")
features = store.get_online_features(...)

Reference: feast-dev/feast and Feast architecture/online store docs.

OpenTelemetry Unified Observability

from opentelemetry import trace
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("agent_run"):
    ...

Reference: OTel docs and Collector docs; suitable for including agent run, tool calls, retrieval, and approval together in one trace.

GitHub Actions + GitOps Release

# .github/workflows/deploy.yml
on: [push]
jobs:
  build-test-deploy:
    runs-on: ubuntu-latest
    steps: [{ uses: actions/checkout@v4 }]

Reference: GitHub Actions docs + Argo CD declarative GitOps.

Data Quality Gate

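from dagster import AssetCheckResult, asset_check

@asset_check(asset=my_table)  # my_table: an asset defined elsewhere
def no_nulls() -> AssetCheckResult: ...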

Reference: dagster-io/dagster and Dagster asset checks.

Open Questions and Notes

The term FDE itself is still a rapidly evolving industry term. Currently, the most authoritative public sources come from official role descriptions, product documentation, and customer stories from organizations like OpenAI, Palantir, and Anthropic, rather than a unified academic standard. Therefore, this article’s definition of FDE is an engineering synthesis based on official primary sources, focusing on AI/ML production deployment rather than all possible job semantics. Meanwhile, cloud vendor capabilities and pricing pages change rapidly in 2026; the “cost signal” in the table is better suited for directional judgment rather than procurement quotes. Cases like Klarna and Morgan Stanley are mainly from official customer stories and public technical sharing, suitable for extracting architecture and governance lessons, but should not be treated as independent audit reports.

▍Primary References

https://openai.com/careers/forward-deployed-engineer-%28fde%29-nyc-new-york-city/
https://martinfowler.com/articles/cd4ml.html
https://docs.langchain.com/oss/python/langgraph/overview
https://openai.com/index/openai-launches-the-deployment-company/
https://palantir.com/docs/foundry/ai-fde/overview/
https://airflow.apache.org/docs/apache-airflow/stable/index.html
https://docs.feast.dev/
https://kserve.github.io/website/docs/intro
https://forum.openai.com/public/videos/technical-success-office-hours-swam-11-14-2024
https://kubernetes.io/zh-cn/docs/concepts/workloads/controllers/deployment/
https://aws.amazon.com/sagemaker/ai/
https://openai.com/careers/forward-deployed-engineer-gov-washington-dc/
https://docs.getdbt.com/docs/mesh/govern/model-contracts
https://docs.bentoml.com/en/latest/examples/vllm.html
https://aws.amazon.com/sagemaker/ai/pricing/
https://learn.microsoft.com/en-us/azure/machine-learning/overview-what-is-azure-machine-learning?view=azureml-api-2
https://azure.microsoft.com/en-us/products/machine-learning
https://cloud.google.com/products/gemini-enterprise-agent-platform
https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform
https://www.kubeflow.org/docs/started/introduction/
https://openai.com/index/morgan-stanley/
https://openai.com/index/klarna/
https://www.uber.com/us/en/blog/michelangelo-machine-learning-platform/
https://www.nist.gov/itl/ai-risk-management-framework
https://owasp.org/www-project-top-10-for-large-language-model-applications/
https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng
https://developers.openai.com/cookbook/examples/partners/agentic_governance_guide/agentic_governance_cookbook
https://opentelemetry.io/docs/what-is-opentelemetry/
https://sre.google/workbook/error-budget-policy/
https://docs.getdbt.com/reference/resource-configs/contract
https://mlflow.org/docs/latest/ml/model-registry/
https://papers.neurips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
https://www.uber.com/us/en/blog/from-predictive-to-generative-ai/
https://sre.google/sre-book/monitoring-distributed-systems/
https://github.com/langchain-ai/langgraph
https://github.com/kserve/kserve
https://github.com/bentoml/BentoVLLM
https://github.com/dbt-labs/dbt-core
https://github.com/feast-dev/feast
https://docs.github.com/actions/get-started/quickstart
https://github.com/dagster-io/dagster
