Getting the list of classes in a given Python file

My goal is to fetch the list of classes defined in a given Python file.

Following this link, I have implemented the following:

File b.py:

import imp
import inspect

module = imp.load_source("__inspected__", './a.py')
class_members = inspect.getmembers(module, inspect.isclass)
for class_name, class_obj in class_members:
    print(class_name)

File a.py:

from c import CClass


class MyClass:
    name = 'Edgar'

    def foo(self, x):
        print(x)

File c.py:

c_var = 2

class CClass:
    name = 'Anna'

I have two issues with this implementation. First, as mentioned in the post, classes from imported modules are printed as well, and I can’t figure out how to exclude them. Second, it looks like the imp module is deprecated in favour of importlib, but the docs seem sketchy and I can’t figure out how to refactor my solution. Any hints?

Answer

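To use importlib the way you’re using imp, see “Python 3.4: How to import a module given the full path?”, which gives you something like the following. Note that this only replaces imp; imported classes such as CClass are still listed, which is what the two solutions below address.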

import importlib.machinery
import inspect

module = importlib.machinery.SourceFileLoader("a", './a.py').load_module()
class_members = inspect.getmembers(module, inspect.isclass)
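The load_module() method used above is itself deprecated in current Python versions. As a minimal sketch of the spec-based alternative in importlib.util, something like the following should work; the helper names load_module_from_path and list_classes are made up for illustration and are not part of any library.

```python
import importlib.util
import inspect


def load_module_from_path(name, path):
    """Load the file at `path` as a module called `name` (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path!r}")
    module = importlib.util.module_from_spec(spec)
    # Runs the file's top-level code, just like a regular import would.
    spec.loader.exec_module(module)
    return module


def list_classes(module):
    """Return (name, class) pairs for every class visible in the module."""
    return inspect.getmembers(module, inspect.isclass)
```

Calling load_module_from_path("a", "./a.py") and passing the result to list_classes should give the same pairs as the snippet above, imported classes included. As with any import, a.py can only be loaded if the modules it imports (here c.py) can be found.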

Solution #1: Look up class statements in the Abstract Syntax Tree (AST).

You can parse the file without importing it and collect its top-level class statements. Because only a.py itself is parsed, classes that it imports from other modules never show up.

import ast

def get_classes(path):
    with open(path) as fh:
        root = ast.parse(fh.read(), path)
    classes = []
    # Only the module's direct children are inspected, so nested classes are skipped
    for node in ast.iter_child_nodes(root):
        if isinstance(node, ast.ClassDef):
            classes.append(node.name)
    return classes
    
for c in get_classes('a.py'):
    print(c)
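get_classes only reports top-level classes, because ast.iter_child_nodes(root) yields just the module's direct children. If nested classes, or classes defined inside functions, also matter to you, a variant based on ast.walk could look like the sketch below. The name get_all_classes is hypothetical, and it takes source text rather than a path so that it is easy to try out.

```python
import ast


def get_all_classes(source):
    """Return the names of every class statement in `source`, nested ones included."""
    tree = ast.parse(source)
    # ast.walk visits every node in the tree in no guaranteed order,
    # so the names are sorted to get a stable result.
    return sorted(
        node.name for node in ast.walk(tree) if isinstance(node, ast.ClassDef)
    )
```

Like Solution #1, this never imports the file, so names pulled in by import statements are not reported.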

Solution #2: Look at the imports and ignore names brought in by from ... import statements.

This is more in line with your current approach, but a little jankier. You can parse the file for its from ... import statements (see “Python easy way to read all import statements from py module”) and make sure that none of the imported names show up in your output:

import ast
from collections import namedtuple

Import = namedtuple("Import", ["module", "name", "alias"])

def get_imports(path):
    with open(path) as fh:
        root = ast.parse(fh.read(), path)

    for node in ast.iter_child_nodes(root):
        # Plain "import x" statements bind modules, not classes, so only
        # "from x import y" statements are of interest here
        if not isinstance(node, ast.ImportFrom):
            continue
        # node.module is None for relative imports such as "from . import y"
        module = node.module.split('.') if node.module else []
        for n in node.names:
            yield Import(module, n.name.split('.'), n.asname)

imported = set()
for item in get_imports('a.py'):
    # An alias, if present, is the name actually bound in the module
    imported.add(item.alias if item.alias else item.name[0])

Then you can just filter out the imported things you saw.

for class_name, class_obj in class_members:
    if class_name not in imported:
        print(class_name)

Note that this doesn’t distinguish between imported classes and imported functions, but since only class names are checked against the set, this should work for now.
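Another option, which avoids parsing imports altogether, is to compare each class's __module__ attribute with the name of the loaded module. A class defined in the file carries that module's name, while an imported class carries the name of the module it came from. Here is a minimal sketch, assuming the module object was loaded as shown earlier; get_local_classes is a hypothetical helper name.

```python
import inspect


def get_local_classes(module):
    """Return names of classes defined in `module` itself, skipping imported ones."""
    return [
        name
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if obj.__module__ == module.__name__
    ]
```

For the a.py above, this should list MyClass but not CClass, whose __module__ is c.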