
AI Safety and Privacy: What to Share and What Not To

Practical rules for protecting your sensitive data while getting the most out of AI tools

When I first started using ChatGPT, I made a rookie mistake that still makes me cringe. I was working on a client project and thought, "Hey, why not paste this entire database schema into ChatGPT to help me optimize it?" Thankfully, my brain kicked in just before I hit enter. That schema contained sensitive business logic, client names, and proprietary structures that had no business being on OpenAI's servers.

That moment taught me something crucial: AI tools are incredibly powerful, but we need to be smart about what we feed them. Today, I want to share the practical rules I've developed for staying safe while still getting amazing results from AI.

The Golden Rule: Treat AI Like a Public Forum

Here's the simplest way to think about AI safety: imagine everything you type into an AI tool could end up on a billboard with your name on it. Would you still share it?

Most AI companies use your conversations to improve their models (unless you specifically opt out). Even when they promise not to train on your data, there's always some level of risk. Servers get hacked, employees make mistakes, and policies change.

Privacy Settings Matter

Both ChatGPT and Claude let you opt out of having your conversations used for model training. Enable these opt-outs in your account settings before sharing anything sensitive.

What's Safe to Share: The Green Light List

Let me start with what I freely share with AI tools without losing sleep:

Public information: Anything that's already out there on the internet. Company websites, published articles, public documentation, open-source code repositories.

Generic examples: When I need help with a coding problem, I create simplified, anonymized versions. Instead of sharing my actual e-commerce database structure, I'll ask about a generic "products and users" relationship.

Learning materials: Study notes, practice problems, general concepts you're trying to understand. AI is fantastic for education when you're not dealing with proprietary information.

Creative work you own: Your blog posts, stories, art descriptions, personal projects. Just remember that once you share it, it might influence the AI's future responses to others.

safe-example
# Instead of sharing actual proprietary code:
"How do I optimize this query: SELECT * FROM customer_orders WHERE..."

# Share a generic version:
"How do I optimize a query that joins users and orders tables?"

The Red Zone: Never Share These

Here's my absolute "never share" list, learned through experience and a few close calls:

API keys and passwords: This should be obvious, but I've seen people accidentally paste entire config files with secrets included. Always scrub these first.

Personal identifying information: Real names, addresses, phone numbers, email addresses, social security numbers. Even if you're asking for help with a contact form, use fake data.

Financial information: Bank account numbers, credit card details, transaction histories, salary information, financial projections that aren't public.

Medical information: Health records, medication lists, diagnoses. HIPAA exists for a reason, and AI tools don't count as covered entities.

Legal documents: Contracts, NDAs, patent applications, anything involving legal strategy. Your lawyer would not approve.

Proprietary business data: Customer lists, internal processes, competitive strategies, unreleased product details, internal communications.
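Scrubbing for secrets doesn't have to be manual every time. Here's a rough sketch of the kind of pre-paste check I mean — the patterns and function name are my own illustrative examples, not an exhaustive scanner:

```javascript
// Rough sketch: flag obvious secrets before pasting text into an AI tool.
// These patterns are illustrative examples, not a complete secret scanner.
const SECRET_PATTERNS = [
  /sk-[A-Za-z0-9]{20,}/,                 // OpenAI-style API keys
  /AKIA[0-9A-Z]{16}/,                    // AWS access key IDs
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,  // PEM private keys
  /\b\d{3}-\d{2}-\d{4}\b/,              // US Social Security numbers
];

function looksSensitive(text) {
  return SECRET_PATTERNS.some((pattern) => pattern.test(text));
}
```

A check like this won't catch everything — think of it as a seatbelt, not a substitute for reading what you're about to paste.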

The Yellow Zone: Proceed with Caution

Some data falls into a gray area where you need to make judgment calls:

Work-related but not sensitive: General coding problems, public documentation questions, industry-standard processes. I often ask about React patterns or database design concepts using my work context, but I anonymize everything first.

Educational content: If you're a teacher or trainer, your curriculum might be okay to share, but be mindful of institutional policies.

Semi-public information: Things like LinkedIn profiles or public GitHub repos are technically shareable, but consider whether you want that data connected to your AI usage patterns.

The Anonymization Test

Before sharing anything in the yellow zone, ask yourself: "If this appeared in a data breach tomorrow, would it hurt me or my company?" If yes, anonymize it further or don't share it.

My Practical Sanitization Process

Here's the system I use when I want AI help with work-related problems:

Step 1: The Find and Replace Dance
I replace all real names with generic ones (Company A, User B, Database C), all specific technologies with generic equivalents when possible, and all proprietary terms with industry-standard alternatives.

Step 2: The Simplification Filter
I strip out everything that's not directly related to my question. If I need help with a function, I don't need to include the entire file structure or business context.

Step 3: The Outsider Test
I imagine a competitor reading my sanitized version. Would they learn anything valuable about our business? If yes, I simplify further.

sanitization-example
# Original (DON'T SHARE):
function processAcmeCorpPayment(creditCard, amount) {
  // Proprietary fraud detection logic
}

# Sanitized version:
function processPayment(paymentMethod, amount) {
  // Need help with validation logic here
}
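Step 1 of the process can even be automated with a tiny helper. This is a sketch under my own naming — the replacement map here is made up, and a real pass would use a project-specific one:

```javascript
// Minimal sketch of the find-and-replace step: swap real names for
// generic placeholders before sharing a snippet. This replacement map
// is a made-up example; build your own for your project.
const REPLACEMENTS = {
  "AcmeCorp": "CompanyA",
  "acme_orders": "orders",
  "jane.doe@acmecorp.com": "user@example.com",
};

function sanitize(snippet) {
  return Object.entries(REPLACEMENTS).reduce(
    (text, [real, generic]) => text.split(real).join(generic),
    snippet
  );
}
```

Keeping a helper like this next to your editor makes the "find and replace dance" a five-second habit instead of a chore you skip.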

Tool-Specific Considerations

ChatGPT: Offers chat history controls and a training opt-out. The business tiers (Team and Enterprise) come with stricter data handling commitments than the consumer plans.

Claude: Anthropic has strong privacy commitments, but the same general rules apply. Even when a provider says it doesn't train on your conversations by default, policies can change, and "not used for training" doesn't mean zero risk.

Coding assistants (Copilot, Cursor): These see your entire codebase context. Be extra careful about API keys in config files and sensitive comments in your code.

Specialized AI tools: Each tool has different privacy policies. Grammar checkers, image generators, and voice assistants all handle data differently.
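For the config-file problem with coding assistants, the cleanest fix is to keep secrets out of the files the assistant can see in the first place and read them from environment variables instead. A minimal sketch — the variable name is an example, not any real service's convention:

```javascript
// Sketch: load secrets from the environment instead of hardcoding them
// in config files that a coding assistant can read and index.
// PAYMENT_API_KEY is an example name, not a real service's variable.
function getApiKey() {
  const key = process.env.PAYMENT_API_KEY;
  if (!key) {
    throw new Error("PAYMENT_API_KEY is not set");
  }
  return key;
}
```

Pair this with a gitignored .env file and the assistant only ever sees the variable name, never the secret itself.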

Building Good Habits

The goal isn't to be paranoid—it's to be intentional. I've developed these habits that now feel natural:

Always pause before pasting: That extra second to think "Is this okay to share?" has saved me multiple times.

Create templates for common tasks: I have sanitized versions of common coding patterns I frequently ask about. This saves time and reduces risk.

Use separate accounts: I keep my personal AI experiments separate from anything work-related. Different contexts, different risk levels.

Stay updated on policies: AI companies change their data handling practices. What was safe last year might not be safe today.

The Bottom Line

AI tools are incredibly valuable, but they're not magic black boxes that exist outside the normal rules of data security. Every piece of information you share is a small bet you're making about the future—betting that it won't come back to hurt you.

The good news? With a little thought and preparation, you can get 90% of AI's benefits while taking on almost none of the risk. Start with the assumption that your data isn't private, work backward from there, and you'll develop an intuition for what's safe to share.

Remember: it's always easier to share more later than to take back what you've already shared. When in doubt, err on the side of caution. Your future self will thank you.
