Skip to content

Security for AI/ML Product Companies and Enterprise AI Deployments

Prompt-injection testing, RAG and tool-integration security, LLM red-teaming, vector-store attack surface, model supply-chain assessment

By Melina Editorial

The security model in production LLM-powered systems is still being established. The pace of new attack-class discovery — prompt injection variants, agent-tool abuse, vector-store poisoning, system-prompt exfiltration through carefully constructed user input — has outrun most enterprise security programs.

Melina supports two distinct buyer profiles in this space:

  • AI product companies building LLM-powered SaaS, agentic systems, or vertical-AI products
  • Enterprises deploying third-party LLM platforms (Microsoft Copilot, AWS Bedrock-based assistants, in-house GPT-4 / Claude / Gemini integrations) into operational workflows

Where Melina engages on AI/ML security

Prompt injection testing

Direct and indirect prompt injection is the LLM01 entry on the OWASP LLM Top 10 and remains the highest-impact and highest-frequency attack class against deployed LLM systems. We test:

  • Direct prompt injection through user-input channels
  • Indirect prompt injection through retrieved content (RAG sources, tool outputs, third-party APIs)
  • System-prompt extraction and replacement
  • Output-handling abuse — where downstream code processes LLM output as trusted data

Prompt injection testing is the most-requested entry point for our AI/ML engagements.

RAG and tool-integration security

LLM systems with retrieval (RAG) and tool integration have the largest attack surface because every retrieval source and every tool integration is a potential injection vector. We assess:

  • Retrieval source authorization and tenant isolation
  • Vector-store poisoning and embedding manipulation
  • Tool-call authorization boundaries (what can the model invoke, with whose authority)
  • Cross-tenant or cross-context data leakage via shared retrieval infrastructure

Agentic system red-teaming

Agentic systems — LLMs that plan multi-step actions, invoke tools, and operate over extended time horizons — introduce attack scenarios that point-in-time prompt-injection testing does not cover. We run structured red-team exercises against agentic deployments, including:

  • Goal-misalignment scenarios (the agent achieves its stated goal but harms a non-stated constraint)
  • Tool-abuse scenarios (the agent uses tools in unexpected combinations to reach unsafe outcomes)
  • Excessive-agency scenarios (the agent has more authority than the threat model assumed)

Model supply-chain assessment

The model supply chain — base-model weights, fine-tuned versions, embedding models, third-party datasets — is the LLM03 entry on the OWASP LLM Top 10. We assess provenance, integrity verification, and the risk model around third-party model sources for engagements where the supply-chain attack surface is material.

Enterprise AI deployment review

For enterprises deploying third-party AI platforms into operational workflows, the assessment scope is the integration layer — identity propagation, data-source authorization, output-handling, audit logging — rather than the foundation model itself. We provide an integration-layer review tailored to the platform in use.

Service mapping

AI/ML engagements typically draw across:

Compliance and standards frame

AI/ML engagements typically operate within:

  • OWASP LLM Top 10
  • NIST AI Risk Management Framework
  • MITRE ATLAS adversarial-ML threat taxonomy
  • EU AI Act for high-risk-classified systems entering the European single market
  • China-market: the Generative AI Service Management Measures and related guidance for products serving mainland-China users

Engagement model

Single-product prompt-injection testing is typically Scoped Assessment. Agentic-system red-team exercises and enterprise-deployment reviews are typically Custom Engagement. Continuous engagement as the product evolves is typically Retainer.

“The most consequential LLM-application findings we discover live in the gaps between teams, not in any single team’s work product. The input team’s code review doesn’t catch the retrieval team’s authorization gap. The output team’s renderer doesn’t catch what the model can be induced to emit. We’ve published the Five-Boundary Attack-Surface Taxonomy specifically to make those gaps visible at threat-modeling time rather than at production exploitation time.” — Gleb Z., CTO, Melina Security

What buyers ask us first

Three questions surface in nearly every initial conversation with an AI/ML product team or enterprise AI deployer:

“Where in our LLM architecture should we focus assessment investment?” The boundaries between layers — input → retrieval → tool-integration → output → persistence — produce the highest-impact findings because no single team owns the cross-boundary scenarios. Our Five-Boundary Attack-Surface Taxonomy maps these boundaries explicitly; assessment scope shaped around them surfaces findings that within-boundary testing misses.

“How do we choose defense investment for our specific deployment shape?” The right defense posture depends on whether you’re shipping consumer-facing, enterprise with moderate tool authority, agentic, RAG-heavy, or high-authority. The Five-Family Posture Matrix recommends per-shape defense families so investment lands on the controls that matter for your architecture rather than uniform “best practice” guidance that under-protects high-authority systems and over-engineers low-authority ones.

“Can we red-team an agentic system meaningfully before it’s fully built?” Yes, with adjusted scope. Pre-build red-teaming evaluates the trust model and the boundary architecture against the planned tool surface; post-build red-teaming evaluates exploitable conditions in the deployed system. Most mature AI programs run both — pre-build for architecture decisions, post-build for production-readiness validation.

Frequently asked questions

Do you test against our actual production system or only a staging environment?

Whichever the client authorizes. Production testing is common for prompt-injection assessment because most injection scenarios depend on the actual retrieval sources and tool integrations the production system uses. We coordinate test pacing, account isolation, and data-handling to avoid impact on real users — see our Rules of Engagement.

Will testing reveal our system prompt or our proprietary fine-tuning?

The assessment report is delivered under NDA and does not disclose proprietary fine-tuning artifacts or system-prompt contents externally. Internal handling within the client organization is the client’s decision. Where the assessment specifically tests system-prompt extractability, the report describes the extraction mechanism rather than reproducing the extracted prompt in full.

Do you work with on-premises model deployments or only cloud-hosted?

Both. On-premises model deployments often have different attack-surface profile (no public API, narrower threat model on the inference plane, broader threat model on the supply-chain side) and engagement scope is sized to that profile.

How does the assessment work for systems that combine multiple AI components (RAG + agents + fine-tuned models)?

Cross-component systems are scoped around the trust boundaries the Five-Boundary Taxonomy identifies. Where the system spans multiple deployment shapes (e.g., a consumer-facing chat product with an enterprise admin-tool surface), assessment applies the stricter Posture Matrix recommendation across the spanning shapes. The combined assessment typically takes 6-10 weeks for a single product line at moderate complexity.

What happens if findings affect the third-party foundation model rather than our application?

Findings at the foundation-model layer (model behavioural quirks, jailbreak susceptibility for the base model itself) are reported alongside application-layer findings but are typically out of remediation scope for the application team. Where the foundation-model finding is novel or material, we coordinate disclosure with the foundation-model vendor under the client’s authorization. Most application-layer remediation focuses on the integration boundaries the client controls rather than the foundation model itself.