{"id":13948,"date":"2026-01-28T15:31:17","date_gmt":"2026-01-28T15:31:17","guid":{"rendered":"https:\/\/www.blopig.com\/blog\/?p=13948"},"modified":"2026-02-03T16:21:26","modified_gmt":"2026-02-03T16:21:26","slug":"advanced-pymol-visualization-for-weighted-structural-ensembles-part-2-efficient-weighted-sasa-surfaces","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2026\/01\/advanced-pymol-visualization-for-weighted-structural-ensembles-part-2-efficient-weighted-sasa-surfaces\/","title":{"rendered":"Advanced PyMOL Visualization for Weighted Structural Ensembles (Part 2): Efficient Weighted SASA Surfaces"},"content":{"rendered":"\n<p>In Part 1, we covered reference state handling, RMSD-based coloring, and cluster visualization for weighted structural ensembles. Now we tackle a more ambitious goal: generating <strong>solvent-accessible surface area (SASA) surfaces<\/strong> that reflect the <em>weighted<\/em> conformational distribution of your ensemble.<\/p>\n\n\n\n<p>Why surfaces? Because they show the <strong>accessible conformational space<\/strong>\u2014where your protein can actually be found, weighted by population. This is particularly powerful when comparing different fitting methods or showing how experimental constraints reshape the ensemble.<\/p>\n\n\n\n<p>The challenge? A typical ensemble might have 500+ frames, each generating thousands of surface points. Naive approaches choke on the computational and memory demands. This post shares the optimizations that make weighted SASA visualization practical.<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\">Why SASA Surfaces for Ensemble Visualization<\/h2>\n\n\n\n<p>Standard ensemble visualization overlays structures as ribbons or cartoons. This works for small ensembles but becomes an unreadable mess beyond ~20 structures. 
Surfaces offer an alternative: instead of showing individual structures, show the <strong>envelope<\/strong> of accessible conformations.<\/p>\n\n\n\n<p>With weighted ensembles, we can go further. Rather than treating all frames equally, we weight surface points by their frame&#8217;s population weight. High-weight regions appear denser; low-weight regions fade away. The result is a surface that reflects the <em>experimentally-consistent<\/em> conformational distribution.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2026\/01\/image-19.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" width=\"625\" height=\"388\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2026\/01\/image-19.png?resize=625%2C388&#038;ssl=1\" alt=\"\" class=\"wp-image-13949\" srcset=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2026\/01\/image-19.png?w=820&amp;ssl=1 820w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2026\/01\/image-19.png?resize=300%2C186&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2026\/01\/image-19.png?resize=768%2C477&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2026\/01\/image-19.png?resize=624%2C387&amp;ssl=1 624w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/figure>\n\n\n\n<p><em>(Left) PCA plots of MD at pH4 (top, blue) and pH7 (bottom, grey). (Right) Weighted-ensemble SASA envelopes. 
Teal: pH4 (Disease-like), Grey: pH7 (Standard).<\/em><\/p>\n\n\n\n<p>The workflow is:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Compute per-atom SASA for each frame<\/li>\n\n\n\n<li>Generate surface points around exposed atoms<\/li>\n\n\n\n<li>Assign weights to points based on their source frame<\/li>\n\n\n\n<li>Filter by weighted local density to show the high-confidence envelope<\/li>\n\n\n\n<li>Render as mesh, spheres, or points<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">Computing Per-Atom SASA with FreeSASA<\/h2>\n\n\n\n<p>We use <a href=\"https:\/\/freesasa.github.io\/\">FreeSASA<\/a> for fast SASA calculations, with a fallback to van der Waals estimates if FreeSASA isn&#8217;t available.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from Bio.PDB import PDBParser\nimport numpy as np\n\ndef compute_sasa_biopython(pdb_file, align_residues=None):\n    \"\"\"\n    Compute per-atom SASA values for a structure.\n    Returns coords, SASA values, and alignment coords.\n    \"\"\"\n    parser = PDBParser(QUIET=True)\n    structure = parser.get_structure('protein', pdb_file)\n    \n    # Get alignment coordinates (for later superposition)\n    align_coords = []\n    for model in structure:\n        for chain in model:\n            for residue in chain:\n                resseq = residue.get_id()[1]\n                if align_residues and resseq not in align_residues:\n                    continue\n                if 'CA' in residue:\n                    align_coords.append(residue['CA'].get_coord())\n    \n    # Try FreeSASA first\n    try:\n        import freesasa\n        import contextlib\n        import os\n        \n        # Suppress FreeSASA's verbose output\n        with contextlib.redirect_stdout(open(os.devnull, 'w')), \\\n             
contextlib.redirect_stderr(open(os.devnull, 'w')):\n            sasa_result = freesasa.calc(freesasa.Structure(pdb_file))\n        \n        atom_sasa = []\n        atom_coords = []\n        atom_index = 0  # FreeSASA numbers atoms in file order\n        for model in structure:\n            for chain in model:\n                for residue in chain:\n                    for atom in residue:\n                        # atomArea() expects an integer atom index,\n                        # not a Bio.PDB Atom object\n                        sasa = sasa_result.atomArea(atom_index)\n                        atom_index += 1\n                        atom_sasa.append(sasa)\n                        atom_coords.append(atom.get_coord())\n        \n        return {\n            'coords': np.array(atom_coords),\n            'sasa': np.array(atom_sasa),\n            'align_coords': np.array(align_coords)\n        }\n    \n    except ImportError:\n        # Fallback: estimate from van der Waals radii\n        vdw_radii = {'C': 1.70, 'N': 1.55, 'O': 1.52, 'S': 1.80, 'H': 1.20}\n        probe_radius = 1.4\n        \n        atom_sasa = []\n        atom_coords = []\n        for model in structure:\n            for chain in model:\n                for residue in chain:\n                    for atom in residue:\n                        elem = atom.element if atom.element else 'C'\n                        radius = vdw_radii.get(elem, 1.70)\n                        # Crude upper bound: full sphere area,\n                        # ignoring burial by neighbouring atoms\n                        sasa = 4 * np.pi * (radius + probe_radius) ** 2\n                        atom_sasa.append(sasa)\n                        atom_coords.append(atom.get_coord())\n        \n        return {\n            'coords': np.array(atom_coords),\n            'sasa': np.array(atom_sasa),\n            'align_coords': np.array(align_coords)\n        }<\/pre>\n\n\n\n<p>For an ensemble, we process each frame:<br><\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def compute_sasa_ensemble(pdb_files):\n    \"\"\"Compute 
SASA for all structures in ensemble.\"\"\"\n    all_sasa_data = []\n    \n    for pdb_file in pdb_files:\n        sasa_data = compute_sasa_biopython(pdb_file)\n        if sasa_data:\n            all_sasa_data.append(sasa_data)\n    \n    return all_sasa_data<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Generating Surface Geometry from SASA Values<\/h2>\n\n\n\n<p>For each atom with significant SASA, we generate points on a sphere proportional to its exposed surface area. This creates a point cloud approximating the molecular surface.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def get_raw_surface_geometry(sasa_data_list, probe_radius=1.4, \n                             sasa_threshold=0.0, frame_subsample=500):\n    \"\"\"\n    Generate surface points from SASA data.\n    \n    Args:\n        sasa_data_list: List of SASA data dicts from compute_sasa_ensemble\n        probe_radius: Probe radius used in SASA calculation\n        sasa_threshold: Minimum SASA to include an atom\n        frame_subsample: Max points per frame (controls memory usage)\n    \n    Returns:\n        surface_points: (N, 3) array of surface coordinates\n        frame_indices: (N,) array mapping each point to its source frame\n    \"\"\"\n    all_surface_points = []\n    all_frame_indices = []\n    \n    for frame_idx, data in enumerate(sasa_data_list):\n        coords = data['coords']\n        sasa_values = data['sasa']\n        frame_points = []\n        \n        # Only process atoms with significant SASA\n        valid_mask = sasa_values >= max(sasa_threshold, 0.01)\n        valid_coords = coords[valid_mask]\n        valid_sasa = sasa_values[valid_mask]\n        \n        for coord, sasa in zip(valid_coords, valid_sasa):\n            # Surface radius from SASA: A = 4\u03c0r\u00b2 \u2192 r = 
\u221a(A\/4\u03c0)\n            surface_radius = np.sqrt(sasa \/ (4 * np.pi)) + probe_radius\n            \n            # Number of points scales with surface area\n            n_points = max(2, int(sasa \/ 5))\n            \n            # Generate spherical coordinates\n            phi = np.linspace(0, 2 * np.pi, n_points)\n            theta = np.linspace(0, np.pi, max(2, n_points \/\/ 2))\n            P, T = np.meshgrid(phi, theta)\n            \n            # Convert to Cartesian\n            X = coord[0] + surface_radius * np.sin(T) * np.cos(P)\n            Y = coord[1] + surface_radius * np.sin(T) * np.sin(P)\n            Z = coord[2] + surface_radius * np.cos(T)\n            \n            points = np.column_stack((X.flatten(), Y.flatten(), Z.flatten()))\n            frame_points.extend(points)\n        \n        # Subsample to control memory\n        if frame_points:\n            frame_points = np.asarray(frame_points)\n            if len(frame_points) > frame_subsample:\n                indices = np.random.choice(len(frame_points), \n                                          frame_subsample, replace=False)\n                frame_points = frame_points[indices]\n            \n            all_surface_points.extend(frame_points)\n            all_frame_indices.extend([frame_idx] * len(frame_points))\n    \n    return (np.array(all_surface_points, dtype=np.float32),\n            np.array(all_frame_indices, dtype=np.int32))<\/pre>\n\n\n\n<p>For a 500-frame ensemble with 500 points per frame, this generates ~250,000 surface points\u2014manageable, but we need to be smart about what comes next.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Caching Intermediate Data<\/h2>\n\n\n\n<p>SASA computation is expensive. Computing it fresh every time you tweak visualization parameters is painful. 
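<\/p>\n\n\n\n<p>Before adding the cache, it is worth a back-of-envelope count of the work a single re-run repeats. A minimal sketch, matching the 500-frame, 500-points-per-frame example above; the per-frame atom count is a hypothetical placeholder:<\/p>

```python
# Rough cost of recomputing everything on each parameter tweak.
# Sizes match the worked example above; n_atoms is a hypothetical placeholder.
n_frames = 500          # frames in the ensemble
n_atoms = 3000          # atoms per frame (assumed)
points_per_frame = 500  # surface points kept after subsampling

sasa_calls = n_frames * n_atoms            # per-atom SASA evaluations per run
cached_points = n_frames * points_per_frame

print(sasa_calls)      # 1500000
print(cached_points)   # 250000
```

\n\n\n\n<p>Every one of those per-atom evaluations is repeated on each run unless the results are stored; a cache hit replaces them all with a single file read.<\/p>\n\n\n\n<p>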
The solution: cache the surface geometry and frame indices.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def get_raw_surface_geometry_cached(sasa_data_list, cache_file=None, **kwargs):\n    \"\"\"\n    Generate surface geometry with optional caching.\n    \"\"\"\n    # Try loading from cache\n    if cache_file and os.path.exists(cache_file):\n        try:\n            cached = np.load(cache_file)\n            return (cached['surface_points'].astype(np.float32),\n                    cached['frame_indices'].astype(np.int32))\n        except Exception as e:\n            print(f\"Cache load failed: {e}\")\n    \n    # Compute fresh\n    points, indices = get_raw_surface_geometry(sasa_data_list, **kwargs)\n    \n    # Save cache\n    if cache_file and points is not None:\n        try:\n            np.savez_compressed(cache_file, \n                               surface_points=points,\n                               frame_indices=indices)\n        except Exception as e:\n            print(f\"Cache save failed: {e}\")\n    \n    return points, indices<\/pre>\n\n\n\n<p>I use a naming convention that encodes the parameters:<br><\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># Cache file includes ensemble name and trajectory base name\ncache_file = f\"{cluster_name}_{traj_base}_uniform_aligned_surface_cache.npz\"\n\n# For weighted surfaces, include the weights file identifier\nweighted_cache = f\"{cluster_name}_{traj_base}_weighted_aligned_surface_cache.npz\"<\/pre>\n\n\n\n<p>This way, changing your weights file triggers a recompute, but re-running the same analysis loads 
instantly.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Weighted Density Filtering: The Key Insight<\/h2>\n\n\n\n<p>Here&#8217;s where weighted ensembles get interesting. We have surface points, and each point has a weight from its source frame. But raw point clouds are noisy\u2014we want to show the <strong>high-confidence regions<\/strong> where multiple high-weight frames agree.<\/p>\n\n\n\n<p>The approach: compute <strong>weighted local density<\/strong> for each point, then filter to keep only points above a percentile threshold.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def filter_surface_density(points, point_weights, ci_percentile=95, \n                           density_radius=2.0):\n    \"\"\"\n    Filter surface points by weighted local density.\n    \n    Points in regions where many high-weight frames contribute\n    will have high weighted density and be retained.\n    \n    Args:\n        points: (N, 3) surface coordinates\n        point_weights: (N,) weight for each point\n        ci_percentile: Keep points above this density percentile\n        density_radius: Radius for local density calculation (Angstroms)\n    \n    Returns:\n        Filtered points array\n    \"\"\"\n    from sasa_workers import compute_weighted_densities\n    \n    densities = compute_weighted_densities(\n        points, point_weights, \n        radius=density_radius\n    )\n    \n    # Keep points at or above the ci_percentile-th density percentile,\n    # so higher values give a tighter envelope\n    threshold = np.percentile(densities, ci_percentile)\n    return points[densities >= threshold]<\/pre>\n\n\n\n<p>The <code>ci_percentile<\/code> parameter controls how much of the surface you show:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>95%<\/strong>: Show only the highest-density core (tight envelope)<\/li>\n\n\n\n<li><strong>85%<\/strong>: Show moderate density regions 
(broader coverage)<\/li>\n\n\n\n<li><strong>50%<\/strong>: Show everything above median density<\/li>\n<\/ul>\n\n\n\n<p>For comparing uniform vs weighted ensembles, I typically use 85% to see meaningful differences without too much noise.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Computational Trade-offs: Voxel Grid vs KD-Tree<\/h2>\n\n\n\n<p>Computing local density for 250,000 points is expensive. You need to find all neighbors within <code>density_radius<\/code> for each point\u2014a classic spatial query problem. I&#8217;ve implemented two approaches with different trade-offs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">KD-Tree with Query Pairs (Recommended)<\/h3>\n\n\n\n<p>SciPy&#8217;s <code>cKDTree.query_pairs<\/code> finds all point pairs within a radius in a single vectorized call, avoiding a slow per-point Python loop:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from scipy.spatial import cKDTree\nimport numpy as np\n\ndef compute_weighted_densities_kdtree(points, weights, radius):\n    \"\"\"\n    Fast weighted density using KD-tree query_pairs.\n    \n    Time complexity: O(N log N) construction + O(N * k) for k neighbors\n    Memory: O(N) for tree + O(pairs) for results\n    \"\"\"\n    points = np.asarray(points, dtype=np.float32)\n    weights = np.asarray(weights, dtype=np.float32)\n    N = len(points)\n    \n    tree = cKDTree(points, compact_nodes=False, balanced_tree=False)\n    \n    # Get all pairs within radius - returns (M, 2) array\n    pairs = tree.query_pairs(r=radius, output_type='ndarray')\n    \n    densities = np.zeros(N, dtype=np.float32)\n    \n    if pairs.size > 0:\n        i, j = pairs[:, 0], pairs[:, 1]\n        # Each point accumulates weights of its neighbors\n        np.add.at(densities, i, weights[j])\n        np.add.at(densities, j, 
weights[i])\n    \n    # Include self-weight\n    densities += weights\n    \n    return densities<\/pre>\n\n\n\n<p>This is fast and memory-efficient for most ensemble sizes. The <code>query_pairs<\/code> approach is particularly elegant because it naturally handles the symmetry of neighbor relationships.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Voxel Grid (For Very Large Ensembles)<\/h3>\n\n\n\n<p>For ensembles with millions of points, even KD-tree construction becomes slow. A voxel grid approach provides better scaling:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def compute_weighted_densities_voxel(points, weights, radius):\n    \"\"\"\n    Voxel-based weighted density for very large point sets.\n    \n    Time complexity: O(N) for binning + O(N * 27) for neighbor cells\n    Memory: O(N) + O(unique_cells)\n    \"\"\"\n    points = np.asarray(points, dtype=np.float32)\n    weights = np.asarray(weights, dtype=np.float32)\n    N = len(points)\n    \n    cell_size = radius\n    coords_min = points.min(axis=0)\n    \n    # Compute cell indices for each point\n    ijk = np.floor((points - coords_min) \/ cell_size).astype(np.int32)\n    \n    # Hash cells to unique keys\n    keys = ((ijk[:, 0].astype(np.int64) &lt;&lt; 42) ^ \n            (ijk[:, 1].astype(np.int64) &lt;&lt; 21) ^ \n            ijk[:, 2].astype(np.int64))\n    \n    # Group points by cell\n    sort_idx = np.argsort(keys)\n    sorted_keys = keys[sort_idx]\n    unique_keys, key_starts = np.unique(sorted_keys, return_index=True)\n    \n    # Build cell -> point range lookup\n    cell_ends = np.empty_like(key_starts)\n    cell_ends[:-1] = key_starts[1:]\n    cell_ends[-1] = N\n    \n    key_to_range = {\n        int(k): (int(s), int(e)) \n        for k, s, e in zip(unique_keys, key_starts, cell_ends)\n   
 }\n    \n    densities = np.zeros(N, dtype=np.float32)\n    r_sq = radius * radius\n    \n    # 3x3x3 neighbor offsets\n    offsets = [(dx, dy, dz) \n               for dx in (-1, 0, 1) \n               for dy in (-1, 0, 1) \n               for dz in (-1, 0, 1)]\n    \n    # Process each cell\n    for center_key in unique_keys:\n        start_c, end_c = key_to_range[int(center_key)]\n        center_idx = sort_idx[start_c:end_c]\n        center_pts = points[center_idx]\n        center_w = weights[center_idx]\n        \n        i0, j0, k0 = ijk[center_idx[0]]\n        \n        for dx, dy, dz in offsets:\n            neighbor_key = ((int(i0 + dx) &lt;&lt; 42) ^ \n                           (int(j0 + dy) &lt;&lt; 21) ^ \n                           int(k0 + dz))\n            \n            if neighbor_key not in key_to_range:\n                continue\n            \n            # Avoid double-counting pairs\n            if neighbor_key &lt; center_key:\n                continue\n            \n            start_n, end_n = key_to_range[neighbor_key]\n            neighbor_idx = sort_idx[start_n:end_n]\n            neighbor_pts = points[neighbor_idx]\n            neighbor_w = weights[neighbor_idx]\n            \n            # Compute pairwise distances\n            diff = center_pts[:, None, :] - neighbor_pts[None, :, :]\n            d2 = np.sum(diff * diff, axis=2)\n            mask = d2 &lt;= r_sq\n            \n            if neighbor_key == center_key:\n                # Same cell: exclude self-pairs\n                np.fill_diagonal(mask, False)\n                densities[center_idx] += mask.astype(np.float32) @ center_w\n            else:\n                # Different cells: update both\n                densities[center_idx] += mask.astype(np.float32) @ neighbor_w\n                densities[neighbor_idx] += mask.T.astype(np.float32) @ center_w\n    \n    # Include self-weight, matching the KD-tree version\n    densities += weights\n    \n    return densities<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">When to Use Which?<\/h3>\n\n\n\n<figure 
class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Method<\/th><th>Best For<\/th><th>Time<\/th><th>Memory<\/th><\/tr><\/thead><tbody><tr><td>KD-Tree<\/td><td>&lt; 500K points<\/td><td>~10s<\/td><td>Moderate<\/td><\/tr><tr><td>Voxel Grid<\/td><td>&gt; 500K points<\/td><td>~30s<\/td><td>Lower<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>In practice, KD-tree handles most ensemble visualization tasks. I include the voxel approach for batch processing or when memory is constrained.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Drawing Surfaces: Mesh, Spheres, and Points<\/h2>\n\n\n\n<p>Once filtered, we need to render the point cloud. PyMOL offers several approaches:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Point Visualization (Fastest)<\/h3>\n\n\n\n<p>Good for quick iteration:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from pymol import cmd, cgo\n\ndef draw_points(points, name, color, max_points=20000):\n    \"\"\"Draw surface as small crosses at each point.\"\"\"\n    if len(points) > max_points:\n        indices = np.random.choice(len(points), max_points, replace=False)\n        points = points[indices]\n    \n    obj = [cgo.BEGIN, cgo.LINES, cgo.COLOR] + list(color)\n    \n    for x, y, z in points:\n        # Draw small cross at each point\n        for dx, dy, dz in [(0.1,0,0), (-0.1,0,0), \n                           (0,0.1,0), (0,-0.1,0), \n                           (0,0,0.1), (0,0,-0.1)]:\n            obj.extend([cgo.VERTEX, x+dx, y+dy, z+dz])\n    \n    obj.append(cgo.END)\n    cmd.load_cgo(obj, name)<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Sphere Visualization (Publication Quality)<\/h3>\n\n\n\n<p>Better looking but slower:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" 
data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def draw_spheres(points, name, color, radius=0.3, max_spheres=3000):\n    \"\"\"Draw surface as small spheres.\"\"\"\n    if len(points) > max_spheres:\n        indices = np.random.choice(len(points), max_spheres, replace=False)\n        points = points[indices]\n    \n    obj = [cgo.BEGIN, cgo.TRIANGLES, cgo.COLOR] + list(color)\n    \n    for x, y, z in points:\n        # Simple octahedron approximation of sphere\n        vertices = [\n            [x+radius, y, z], [x-radius, y, z],\n            [x, y+radius, z], [x, y-radius, z],\n            [x, y, z+radius], [x, y, z-radius]\n        ]\n        faces = [[0,2,4], [0,4,3], [0,3,5], [0,5,2],\n                 [1,4,2], [1,3,4], [1,5,3], [1,2,5]]\n        \n        for face in faces:\n            for v_idx in face:\n                obj.extend([cgo.VERTEX] + vertices[v_idx])\n    \n    obj.append(cgo.END)\n    cmd.load_cgo(obj, name)<\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Mesh Visualization (Best for Envelopes)<\/h3>\n\n\n\n<p>The mesh approach uses PyMOL&#8217;s built-in Gaussian map to create a smooth surface from the point cloud:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import tempfile\nimport os\n\ndef draw_mesh(points, name, color, max_mesh_points=500000):\n    \"\"\"Draw surface as smooth mesh via Gaussian density map.\"\"\"\n    if len(points) > max_mesh_points:\n        points = points[np.random.choice(len(points), \n                                         max_mesh_points, replace=False)]\n    \n    # Write points as temporary PDB\n    fd, temp_pdb = tempfile.mkstemp(suffix=\".pdb\")\n    os.close(fd)\n    \n    temp_obj = f\"{name}_pts\"\n    map_name = 
f\"{name}_map\"\n    \n    try:\n        with open(temp_pdb, 'w') as f:\n            for i, (x, y, z) in enumerate(points):\n                f.write(f\"ATOM  {(i%99999)+1:5d}  X   PTS A   1    \"\n                       f\"{x:8.3f}{y:8.3f}{z:8.3f}  1.00  1.00           C\\n\")\n        \n        # Load and create Gaussian map\n        cmd.load(temp_pdb, temp_obj)\n        cmd.hide(\"everything\", temp_obj)\n        cmd.map_new(map_name, \"gaussian\", 1.0, temp_obj, 5.0)\n        \n        # Generate isomesh at appropriate contour level\n        cmd.isomesh(name, map_name, level=0.8)\n        \n        # Color the mesh\n        cmd.set_color(f\"col_{name}\", color)\n        cmd.color(f\"col_{name}\", name)\n        cmd.show(\"mesh\", name)\n        \n        cmd.delete(temp_obj)\n        \n    finally:\n        if os.path.exists(temp_pdb):\n            os.remove(temp_pdb)<\/pre>\n\n\n\n<p>The Gaussian approach creates beautiful smooth envelopes that clearly show the accessible conformational space.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Putting It All Together<\/h2>\n\n\n\n<p>Here&#8217;s the complete workflow for generating weighted SASA surfaces:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">def generate_weighted_surface(cluster_name, trajectory_file, weights_file,\n                              reference_pdb=None, ci_percentile=85,\n                              density_radius=6.5):\n    \"\"\"\n    Complete pipeline for weighted SASA surface generation.\n    \"\"\"\n    from pymol import cmd\n    \n    # 1. Load trajectory\n    cmd.load(topology_file, cluster_name)\n    cmd.load_traj(trajectory_file, cluster_name)\n    \n    # 2. Extract states to PDB files\n    pdb_files = extract_states_to_pdb(cluster_name, output_dir=\"\/tmp\")\n    \n    # 3. 
Compute SASA for all frames\n    sasa_data = compute_sasa_ensemble(pdb_files)\n    \n    # 4. Align to reference if provided\n    if reference_pdb:\n        target_coords = get_reference_target_coords(reference_pdb)\n        sasa_data = align_sasa_data_to_reference(sasa_data, target_coords)\n    \n    # 5. Generate surface geometry (with caching)\n    cache_file = f\"{cluster_name}_surface_cache.npz\"\n    raw_points, frame_indices = get_raw_surface_geometry_cached(\n        sasa_data, cache_file=cache_file\n    )\n    \n    # 6. Load weights and map to points\n    weights = np.load(weights_file)[\"weights\"]\n    point_weights = weights[frame_indices]\n    \n    # 7. Filter by weighted density\n    filtered_points = filter_surface_density(\n        raw_points, point_weights,\n        ci_percentile=ci_percentile,\n        density_radius=density_radius\n    )\n    \n    # 8. Draw surfaces\n    color_weighted = [0.95, 0.65, 0.30]  # Orange for weighted\n    color_uniform = [0.60, 0.80, 1.00]   # Light blue for uniform\n    \n    draw_mesh(filtered_points, f\"{cluster_name}_weighted_mesh\", color_weighted)\n    \n    # Compare with uniform weighting\n    uniform_points = filter_surface_density(\n        raw_points, np.ones(len(raw_points)),\n        ci_percentile=ci_percentile,\n        density_radius=density_radius\n    )\n    draw_mesh(uniform_points, f\"{cluster_name}_uniform_mesh\", color_uniform)\n    \n    # 9. Clean up temp files\n    for f in pdb_files:\n        os.remove(f)\n    \n    # 10. 
Final rendering\n    cmd.set(\"mesh_width\", 0.5)\n    cmd.set(\"transparency_mode\", 1)\n    cmd.zoom()<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Parameter Tuning Guide<\/h2>\n\n\n\n<p>The key parameters to adjust:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Parameter<\/th><th>Effect<\/th><th>Typical Range<\/th><\/tr><\/thead><tbody><tr><td><code>frame_subsample<\/code><\/td><td>Points per frame (memory vs detail)<\/td><td>300-1000<\/td><\/tr><tr><td><code>density_radius<\/code><\/td><td>Neighborhood size for density<\/td><td>4.0-8.0 \u00c5<\/td><\/tr><tr><td><code>ci_percentile<\/code><\/td><td>Surface coverage (higher = tighter)<\/td><td>80-95%<\/td><\/tr><tr><td><code>sasa_threshold<\/code><\/td><td>Minimum exposure to include<\/td><td>0.0-1.0 \u00c5\u00b2<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>For comparing methods, keep all parameters identical except the weights. This ensures differences come from the reweighting, not visualization artifacts.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>Weighted SASA surfaces provide a powerful way to visualize ensemble distributions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Generate surface points<\/strong> proportional to atomic SASA values<\/li>\n\n\n\n<li><strong>Cache intermediate data<\/strong> to enable rapid iteration<\/li>\n\n\n\n<li><strong>Filter by weighted local density<\/strong> to show high-confidence regions<\/li>\n\n\n\n<li><strong>Choose the right algorithm<\/strong> (KD-tree for most cases, voxel for very large ensembles)<\/li>\n\n\n\n<li><strong>Render as mesh<\/strong> for smooth, publication-quality envelopes<\/li>\n<\/ul>\n\n\n\n<p>Combined with the reference handling and RMSD coloring from Part 1, you now have a complete toolkit for visualizing weighted structural ensembles in PyMOL.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In Part 1, we covered reference state handling, RMSD-based coloring, and cluster visualization for weighted structural ensembles. Now we tackle a more ambitious goal: generating solvent-accessible surface area (SASA) surfaces that reflect the weighted conformational distribution of your ensemble. Why surfaces? Because they show the accessible conformational space\u2014where your protein can actually be found, weighted [&hellip;]<\/p>\n","protected":false},"author":115,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[647,351,227],"tags":[132],"ppma_author":[725],"class_list":["post-13948","post","type-post","status-publish","format-standard","hentry","category-molecular-dynamics","category-molecular-visualization","category-python-code","tag-pymol"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":725,"user_id":115,"is_guest":0,"slug":"alexi","display_name":"Alexi 
Hussain","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/aa505acb767aa3aa80da3b68d4684bf5d5064ac6aed2a01206a61383a967535a?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/13948","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/115"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=13948"}],"version-history":[{"count":2,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/13948\/revisions"}],"predecessor-version":[{"id":13959,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/13948\/revisions\/13959"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=13948"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=13948"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=13948"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=13948"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}