How to validate JSON data in Python?

I need to create a function in Python that validates incoming JSON data and returns it as a Python dictionary. The function should check that all the necessary fields are present in the JSON data and also validate the data types of those fields. I would like to use a try-except mechanism for error handling. Could you provide some code snippets or examples showing how to validate JSON data in Python?

I’ve spent a fair bit of time working with JSON in Python, and a simple way to validate it is by using the built-in json module with a try-except block. Here’s what I do:

import json

def validate_json(json_data, required_fields):
    try:
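        # Parse the JSON
        data = json.loads(json_data)

        # json.loads can also return a list, str, number, etc.; require an object
        if not isinstance(data, dict):
            raise TypeError("Top-level JSON value must be an object")
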
        # Validate presence and types of required fields
        for field, expected_type in required_fields.items():
            if field not in data:
                raise ValueError(f"Missing required field: {field}")
            if not isinstance(data[field], expected_type):
                raise TypeError(f"Field '{field}' must be of type {expected_type.__name__}")
        
        return data
    except (json.JSONDecodeError, ValueError, TypeError) as e:
        print(f"Error: {e}")
        return None

json_data = '{"name": "John", "age": 30}'
required_fields = {"name": str, "age": int}
validated_data = validate_json(json_data, required_fields)
print(validated_data)

This approach is straightforward and works well for basic validations like field presence and data types.
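
One caveat with isinstance checks: bool is a subclass of int in Python, so a JSON true passes an int check. Here is a minimal sketch of a stricter check, assuming you want to reject booleans for non-boolean fields and that each expected type is a single class rather than a tuple. The helper name is_strict_type is hypothetical, not part of any library:

```python
import json

def is_strict_type(value, expected_type):
    """isinstance check that rejects bool unless bool itself is expected.

    Assumes expected_type is a single class, not a tuple of classes.
    """
    if isinstance(value, bool) and expected_type is not bool:
        return False
    return isinstance(value, expected_type)

data = json.loads('{"name": "John", "age": true}')
print(isinstance(data["age"], int))      # True: bool is a subclass of int
print(is_strict_type(data["age"], int))  # False
print(is_strict_type(30, int))           # True
```

Swapping this in for the plain isinstance call in the loop would make {"age": true} fail validation.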

Building on what Richa shared, if you’re looking for something more structured and scalable, I recommend the third-party jsonschema library (pip install jsonschema) for validation. Here’s an example:

import json
from jsonschema import validate, ValidationError

def validate_json_with_schema(json_data, schema):
    try:
        data = json.loads(json_data)
        # Validate the JSON data against the schema
        validate(instance=data, schema=schema)
        return data
    except (json.JSONDecodeError, ValidationError) as e:
        print(f"Error: {e}")
        return None

json_data = '{"name": "John", "age": 30}'
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
validated_data = validate_json_with_schema(json_data, schema)
print(validated_data)

This method adds a layer of robustness because the schema defines exactly what you expect from the JSON structure. Plus, the schema can express rules you would otherwise hand-code, such as nested objects, array items, enums, and numeric ranges.
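
Note that validate() raises a single ValidationError even when the data has several problems. If you would rather see every problem at once, the library's validator classes provide iter_errors. Here is a minimal sketch, assuming a Draft 7 schema; the helper name collect_schema_errors is hypothetical:

```python
import json
from jsonschema import Draft7Validator

def collect_schema_errors(json_data, schema):
    """Return (data, errors); errors is a list of messages, empty when valid."""
    try:
        data = json.loads(json_data)
    except json.JSONDecodeError as e:
        return None, [f"Invalid JSON: {e}"]
    validator = Draft7Validator(schema)
    # iter_errors yields one ValidationError per problem instead of raising
    errors = [error.message for error in validator.iter_errors(data)]
    return data, errors

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}

# "name" has the wrong type and "age" is missing, so two errors are reported
data, errors = collect_schema_errors('{"name": 123}', schema)
for message in errors:
    print(message)
```

Returning the error list rather than printing it leaves the caller free to log it, show it to a user, or turn it into an HTTP response.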

I agree with Ian’s approach for complex cases, but sometimes I prefer keeping things manual for more control, especially when debugging specific scenarios. Here’s my take on manually validating JSON data:

import json

def validate_json_data(json_data, required_fields):
    try:
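        data = json.loads(json_data)

        # json.loads can also return a list, str, number, etc.; require an object
        if not isinstance(data, dict):
            raise TypeError("Top-level JSON value must be an object")
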
        # Validate fields and their types manually
        for field, expected_type in required_fields.items():
            if field not in data:
                raise KeyError(f"Missing field: {field}")
            if not isinstance(data[field], expected_type):
                raise TypeError(f"Field '{field}' must be of type {expected_type.__name__}")
        
        return data
    except (json.JSONDecodeError, KeyError, TypeError) as e:
        # str(KeyError) wraps the message in quotes, so print args[0] instead
        print(f"Validation Error: {e.args[0]}")
        return None

json_data = '{"name": "John", "age": 30}'
required_fields = {"name": str, "age": int}
validated_data = validate_json_data(json_data, required_fields)
print(validated_data)

This manual approach gives me the flexibility to control error messages and validation rules directly in the code.
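
If printing and returning None hides too much from the caller, one variation on the manual approach is to collect every problem and raise a single exception. This is a sketch rather than the code from the answers above; JSONValidationError and validate_or_raise are hypothetical names:

```python
import json

class JSONValidationError(Exception):
    """Carries every problem found in one payload."""
    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = errors

def validate_or_raise(json_data, required_fields):
    try:
        data = json.loads(json_data)
    except json.JSONDecodeError as e:
        raise JSONValidationError([f"Invalid JSON: {e}"]) from e
    if not isinstance(data, dict):
        raise JSONValidationError(["Top-level JSON value must be an object"])

    # Collect all field problems instead of stopping at the first one
    errors = []
    for field, expected_type in required_fields.items():
        if field not in data:
            errors.append(f"Missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(f"Field '{field}' must be of type {expected_type.__name__}")
    if errors:
        raise JSONValidationError(errors)
    return data

try:
    validate_or_raise('{"name": 42}', {"name": str, "age": int})
except JSONValidationError as e:
    print(e.errors)
```

The caller then decides whether to log the errors, retry, or report them, and a None return value is never mistaken for valid data.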