# Rohit Pappu on Molecular Grammar of Condensates – Part 2

| Author | Mark Murcko |
|---|---|
| Type | Kitchen Table Talk |

On August 5, Dewpoint, in partnership with Condensates.com, welcomed back Rohit Pappu, one of the leaders in the condensates field, for the second installment in his 3-part series on the molecular grammar of biomolecular condensates. In this second part, he delved into the relationship between intrinsically disordered proteins and phase behavior. He showed how theory predicts the sequence-encoded conformational heterogeneity of IDPs and demonstrated how these sequence-ensemble relationships are relevant for describing phase behavior by applying the stickers-and-spacers model.

Rohit is the Edwin H. Murty Professor of Engineering and the Director of the Center for Science and Engineering of Living Systems at Washington University in St. Louis. Rohit has made seminal contributions to the field of biomolecular condensates, in particular the drivers of phase transitions that lead to the formation of protein and RNA condensates, and the role that disordered regions play in these cellular processes. Rohit is also a member of Dewpoint’s Scientific Advisory Board and a wonderful advisor, collaborator, and friend.

See Rohit’s second excellent lecture below. Rohit also kindly provided written answers to all of the attendees’ questions; those are below as well. Or see part one here and see part three here. Rohit’s talks are part of our Kitchen Table Talk series.

### TRANSCRIPT

Mark Murcko (00:00:00):

Good morning, good afternoon everybody. Greetings from Dewpoint. It’s good to see everybody back again, and also some new names and faces. This is the second of a three-part lecture series from Rohit. And I think as everyone knows, Rohit is one of the leaders and pioneers in the field of condensates. And this is an ongoing series of lectures that’s part of our Kitchen Table Talk series, which is all available online on the website, condensates.com. It’s intended to be really an opportunity for prominent researchers like Rohit to share their work and their knowledge with the whole community to help all of us to go faster, which is important in such a complex and interesting field as cellular condensates.

Mark Murcko (00:00:45):

So Rohit, today, will be getting more into intrinsically disordered proteins and their relationship to phase behavior. It’ll be a very interesting lecture, of course, as Rohit’s lectures always are. And although I did mention this last time, for those of you who are new, I’ll mention that Rohit is the Edwin H. Murty Professor of Engineering and the director of the Center for Science and Engineering at Wash U in St. Louis and really has been a key leader. And also, he is a member of Dewpoint’s Scientific Advisory Board and has been a great collaborator, mentor, friend and all of that.

Mark Murcko (00:01:26):

And one last point I’d like to make before handing it off to Rohit is that since Rohit’s lecture last week, I’ve received interesting intelligence from several sources about the fact that Rohit, at the Intrinsically Disordered Proteins Gordon Conference, has been known to lead dancing events. And so, I’m hoping that someone in the audience has video of this that they could share with the community. That would be a great thing. Rohit, off to you.

Rohit Pappu (00:01:57):

Well, on that very important note, let’s get started. Thank you, Jill, thank you, Mark. And I should add that Jill has been very kind in recording the transcripts of the various questions that were posted on the chat because, of course, it’s difficult to get to all of the questions. I have been a little tied down, but my plan is to over the next 24 hours, respond to the questions from lecture one, as well as the questions from today’s lecture. So please feel free to either email me or post your questions on chat.

Rohit Pappu (00:02:34):

So as Mark mentioned, today, we are going to get into the contribution of intrinsically disordered proteins. And I’m going to refer to these as multivalent IDPs. As we discussed last time, the stickers and spacers formalism gives us a very good anchor point. But before we launch in, it’s useful to ask something of a rhetorical question at this juncture, what are intrinsically disordered proteins/regions–they go by the name of IDPs and IDRs…

Rohit Pappu (00:03:05):

The term intrinsic, of course, means that whatever this tendency is to be disordered is encoded in the amino acid sequence. But instead of getting into a sort of long and extravagant definition, I’m going to say that IDPs are really sort of characterized by a defining hallmark, and that is heterogeneity. We recognize and appreciate the heterogeneity of conformations, the fact that an assortment of distinct conformations might define an IDP. The interconversion has historically been thought to be on rapid timescales, but I think that trope has also been dismantled recently by the beautiful work by Martin Blackledge, demonstrating that interconversions between distinct conformations can span a range of timescales.

Rohit Pappu (00:03:55):

It used to be the case that we all stipulated that IDPs interacted with one another rather weakly, and the thinking was that IDPs were characterized by what were referred to as weak affinities and high specificities, but that trope too has sort of been dismantled via a lot of recent work, notably that of Ben Schuler, who has demonstrated that, in fact, disordered complexes can be extraordinarily high-affinity complexes. And of course, I needn’t tell this audience that a single IDP can engage itself in a multitude of functions. So really, what I would say is that an IDP or an IDR is characterized by heterogeneity of all forms.

Rohit Pappu (00:04:42):

So one of the things that we have thought about over the years is that in much the same way that the information encoded in amino acid sequences can be connected to the driving forces that dictate, let’s say, a specific fold, so there are sequence-structure relationships, it turns out that there are also codifiable, quantifiable discernible sequence-ensemble relationships. And I’ll sort of get into that first before talking about why that becomes relevant for describing the phase behavior of IDPs.

Rohit Pappu (00:05:16):

So this is a quick summary slide that basically takes something of a personal point of view, but suffice it to say that the work of people like Martin Blackledge, Ben Schuler, Julie Forman-Kay, Tanja Mittag, Gary Daughdrill, Richard Kriwacki, Robert Best and several others has contributed directly to what I’m depicting here in terms of, let’s say, two order parameters: the fraction of positive charges and the fraction of negative charges. And the fact is that IDPs actually come in distinct conformational classes; here is one way to dissect those classes. And I will try to be as modest as possible and make the point that this is a predictive diagram of states; it is far from perfect, so please don’t take this to the bank. It provides sort of guideposts for how we think about the relationship between what is written into amino acid sequences and the types of conformations that they generate.
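
As a rough illustration of the two order parameters mentioned above, the fraction of positive and the fraction of negative charges can be computed directly from a sequence. This is a minimal sketch and not the lab's tooling: the function names are hypothetical, and it assumes Lys/Arg count as positive, Asp/Glu as negative, and histidine as neutral. It deliberately stops short of assigning a conformational class, because the class boundaries are not given in the talk.

```python
POSITIVE = frozenset("KR")  # assumption: Lys and Arg carry +1
NEGATIVE = frozenset("DE")  # assumption: Asp and Glu carry -1; His treated as neutral


def charge_fractions(sequence: str) -> tuple[float, float]:
    """Return (f_plus, f_minus) for a one-letter amino acid sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    f_plus = sum(res in POSITIVE for res in seq) / len(seq)
    f_minus = sum(res in NEGATIVE for res in seq) / len(seq)
    return f_plus, f_minus


def net_charge_per_residue(sequence: str) -> float:
    """Difference between the positive and negative charge fractions."""
    f_plus, f_minus = charge_fractions(sequence)
    return f_plus - f_minus


if __name__ == "__main__":
    toy = "MKKDDGSGSEER"  # hypothetical toy sequence, not a real protein
    print(charge_fractions(toy), net_charge_per_residue(toy))
```

Plotting `f_minus` against `f_plus` for a set of sequences gives the kind of two-axis view the slide describes.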

Rohit Pappu (00:06:14):

So instead of getting into sort of a long-winded palaver over these sequence-ensemble relationships, I will demonstrate to you sort of one collection of techniques that was brought to bear on a very sort of thorny system, huntingtin, which cannot be studied using sort of classical methods of structural biology, because huntingtin is an extraordinarily sticky molecule. And so, if one is interested in what monomeric forms of huntingtin actually look like, one needs to work at extremely low concentrations. And that basically puts us into the territory of single-molecule measurements. Before getting there, let’s sort of orient ourselves with the details of the huntingtin exon 1 encoded fragment of interest. So there is an amphipathic stretch, which we’ll refer to as N17, there is a polymorphic polyglutamine stretch, which we’ll refer to as the polyQ region, and then there is a proline-rich stretch, here referred to as C38, because the most important region encompasses two polyproline modules, P11 and P10, and a proline-rich linker.

Rohit Pappu (00:07:21):

Now, put this into your garden variety disorder predictor, and you will get something like this. Essentially, the idea is that there are profiles that go up and down, but I think most people in the audience will reliably conclude that this predictor is telling us that huntingtin exon 1, potentially irrespective of the polyQ length is likely to be disordered. But I should caution that these profiles really only annotate regions that are likely to be conformationally heterogeneous. They do not tell us anything about the nature of the conformational ensembles, i.e. they do not tell us anything about sequence-ensemble relationships.

Rohit Pappu (00:08:02):

So for various reasons, we were deeply interested in this and we’d been doing a lot of work on our own and then we were approached by Hilal Lashuel who’s something of a maven in chemical biology who can make perhaps the most complicated syntheses look really easy. And so what Hilal and his then postdoc John Warner were able to do was to chemically or use semi-synthetic approaches to make highly pure versions of huntingtin exon 1, and then put, during the chemical synthesis, fluorescent labels on there and they worked with Ed Lemke, who was then at EMBL, and now is at Mainz, to actually do single-molecule FRET measurements as a way to then get some information regarding the profile of intra-molecular distances.

Rohit Pappu (00:08:52):

So for each polyQ length, there were three different doubly-labeled constructs, a green label at one end and a red label that sat at different positions along the sequence. And so essentially, you do three sets of measurements for each construct. And what you get then is FRET efficiencies for different polyQ lengths. But that in and of itself becomes somewhat uninterpretable because you essentially get what you kind of put in, which is that, oh, the FRET efficiencies decrease as the sequence spacing between the dyes increases. But what does that actually tell us?

Rohit Pappu (00:09:31):

So Kiersten Ruff, who was then a postdoc and is now a staff scientist in the lab, basically has developed this methodology of taking our ABSINTH implicit solvation model-based all-atom simulations, which is instantiated in the CAMPARI package that we and Andreas Vitalis co-distribute, as a way to extract conformational ensembles that describe the totality of single-molecule FRET data in this particular instance. And I should note that the concentrations for these measurements needed to be in the picomolar or ultra-low picomolar ranges, because even at those concentrations, you can start to get concerning effects like oligomerization and things like that, that will clearly confound the conformational distribution.

Rohit Pappu (00:10:21):

So, the first step basically involves all-atom simulations that give us some sort of unbiased conformational pool. What Kiersten then does is essentially use a rotamer library from which she extracts the various positions and orientations of dyes. So she puts them back into the ensemble in a post-processing step. And for each of the conformations, one can actually use the Förster formula to calculate a conformation-specific FRET efficiency. But of course, what the experiment gives us is an ensemble-averaged FRET efficiency. And in this case, it’s the ensemble of all conformations for an individual molecule.
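
The per-conformation calculation described above can be sketched with the standard Förster relation, E = 1 / (1 + (r / R0)^6), followed by an average over the ensemble. This is only an illustration: the function names are hypothetical, the dye-to-dye distances are made up, and the Förster radius used in the usage example is a placeholder, not the value for the dye pair in the actual experiments.

```python
from typing import Optional, Sequence


def fret_efficiency(distance: float, forster_radius: float) -> float:
    """Förster formula for a single dye-to-dye distance (same units as R0)."""
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def ensemble_fret(distances: Sequence[float],
                  forster_radius: float,
                  weights: Optional[Sequence[float]] = None) -> float:
    """Weighted mean of per-conformation efficiencies (uniform weights by default)."""
    if not distances:
        raise ValueError("need at least one conformation")
    if weights is None:
        weights = [1.0] * len(distances)
    if len(weights) != len(distances):
        raise ValueError("weights and distances must have the same length")
    total = sum(weights)
    return sum(w * fret_efficiency(d, forster_radius)
               for w, d in zip(weights, distances)) / total


if __name__ == "__main__":
    toy_distances_nm = [3.8, 4.5, 5.2, 6.1]  # hypothetical dye-to-dye distances
    print(ensemble_fret(toy_distances_nm, forster_radius=5.4))  # placeholder R0
```

Note that the average is taken over efficiencies, not over distances; because the formula is strongly nonlinear, the two are not interchangeable.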

Rohit Pappu (00:11:13):

And so what we then do, we, as in Kiersten, is basically use a maximum entropy reweighting method, whereby she minimizes the deviation between the simulated FRET efficiencies and the experimentally derived FRET efficiencies, so that if there is any reweighting that is needed for the conformations, that reweighting is actually carried out. But often it turns out that, at least for this particular system, the extent of reweighting is quite minimal. The key point is that now we have a sort of consistent ensemble that is computationally derived and experimentally ratified. What does it look like? And now you can start to see essentially the polyQ domain shown in orange. The proline-rich region is shown as a semi-flexible tail in the purple-ish colors, and the N17 is sort of the multicolored region that’s adsorbed on the polyQ domain. But what I hope you will visually discern is that, well, the polyQ length has changed. So essentially, the globular domain size has changed, but in effect, this looks like a tadpole. And so here is, let’s say, a histogram of the types of conformations that one obtains.
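
The reweighting idea at the start of this passage can be sketched for the simplest possible case: one observable, a uniform starting ensemble, and an exact match to the experimental mean. Under those assumptions the maximum entropy solution has the form w_i ∝ exp(λ·E_i), and λ can be found by bisection because the reweighted mean increases monotonically with λ. The function name and the numbers are hypothetical; the actual protocol handles several dye pairs at once and is not reproduced here.

```python
import math
from typing import List, Sequence


def maxent_weights(values: Sequence[float], target: float,
                   lam_lo: float = -200.0, lam_hi: float = 200.0) -> List[float]:
    """Weights w_i proportional to exp(lam * values[i]) whose mean matches target."""
    if not (min(values) < target < max(values)):
        raise ValueError("target must lie strictly inside the range of values")

    def weights(lam: float) -> List[float]:
        shift = max(lam * v for v in values)  # subtract the max to avoid overflow
        raw = [math.exp(lam * v - shift) for v in values]
        norm = sum(raw)
        return [x / norm for x in raw]

    def mean(lam: float) -> float:
        return sum(w * v for w, v in zip(weights(lam), values))

    # Bisection on lam: mean(lam) is non-decreasing in lam.
    for _ in range(200):
        mid = 0.5 * (lam_lo + lam_hi)
        if mean(mid) < target:
            lam_lo = mid
        else:
            lam_hi = mid
    return weights(0.5 * (lam_lo + lam_hi))


if __name__ == "__main__":
    simulated = [0.20, 0.40, 0.80]  # hypothetical per-conformation efficiencies
    w = maxent_weights(simulated, target=0.50)  # hypothetical measured mean
    print(w, sum(wi * e for wi, e in zip(w, simulated)))
```

When the unweighted simulated mean already sits close to the measured value, λ comes out near zero and the weights stay nearly uniform, which corresponds to the "quite minimal" reweighting mentioned in the talk.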

Rohit Pappu (00:12:31):

So yes, this is conformationally heterogeneous, but architecturally, all of these conformations sort of have this tadpole-like architecture in the sense that there is a globular head comprising the polyQ region with the N17 adsorbed on it. There’s a very low secondary structure content, which of course is not true when we are looking at low polyQ lengths. We showed back almost 10 years ago that there was a polyQ length-dependent depreciation in the alpha-helical content, and very systematic and detailed studies recently using the [androgen] system from Xavier Salvatella’s lab working with Lindorff-Larsen showed that in fact, this alpha-helicity is real. And Pau Bernadó has shown this as well recently. But that becomes sort of lost effectively as the polyQ length increases.

Rohit Pappu (00:13:29):

So here is, let’s say the overall tadpole morphology quantified using intramolecular distances. And you can see that over in the N17 polyQ domain, the distances are close to one another, but that becomes sort of extended as we go through into the C38 stretch. So why did I go into this? Because it turns out that the sequence-ensemble relationships are actually quite relevant for describing the phase behavior of IDP/IDRs. So in order to do this, let’s sort of take a brief segue into the thermodynamics of let’s say phase separation. Here is a classic Flory-Huggins theory generalized a bit by Muthukumar in this particular case.

Rohit Pappu (00:14:14):

So here is the so-called entropy of mixing: phi is the volume fraction and N is the number of repeats in our homopolymer. And the key point is that if entropy is the only driving force, in this case, it’s a mixing entropy, then you should never see phase separation or any kind of insolubility. You would only see miscibility. Conversely, we have repeating units interacting with one another in a solvent-mediated way. And the strengths of those effective interactions are quantified by this Flory chi parameter. This is the so-called two-body parameter. But of course, chains are not infinitesimally thin, they will self-intersect with one another. So the thickness of the chain is captured by the three-body interactions. And so, there is this additional term that Muthu recognized as being important and added to the mean-field model.
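
For reference, one common textbook way to write a mean-field free energy of mixing per lattice site that matches the description above is sketched here. The symbols $N$ (number of repeat units) and $w$ (three-body coefficient) are labels assumed for illustration; the exact form and prefactors on the slide are not recoverable from the transcript.

$$
\frac{\Delta f_{\text{mix}}}{k_B T} = \frac{\phi}{N}\ln\phi + (1-\phi)\ln(1-\phi) + \chi\,\phi(1-\phi) + w\,\phi^{3}
$$

The first two terms are the mixing entropy, the $\chi$ term carries the two-body interactions, and the $w$ term carries the three-body interactions. In this textbook convention the indifferent (theta) solvent sits at $\chi = 1/2$ rather than at zero, so the $\chi$ discussed in the talk is presumably shifted relative to this one; that reading is an assumption.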

Rohit Pappu (00:15:11):

So you can ask the following question: how sticky are polymers for one another? If chi is positive, then the monomers or the repeating units really like one another as opposed to solvent, chi is positive. If the self-interactions are repulsive, that means the chain really likes the solvent, chi is negative. But of course, there is a crossover between these two regimes where the solvent will be sort of agnostic in the sense that the chain doesn’t really care, whether it interacts with itself or with the surrounding solvent and chi is basically zero.

Rohit Pappu (00:15:46):

Those of you who are experimentalists would, of course, ask the question of how might I be able to sort of measure this? And it turns out that your second virial coefficient, accessible via scattering measurements or even FCS measurements, is something that can be used to extract chi for a typical polymer. The three-body interactions are visualized here in three different ways. Here for a single chain, basically telling us, essentially, in this region the chain cannot interpenetrate beyond this, and that’s the single-chain three-body interaction. You can also imagine a pair of chains coming together. And there is, of course, a thwarting of the interpenetration of the two molecules in this case, this three-body junction. And of course, you can also have three molecules coming together, sort of capturing this lack of interpenetrability.

Rohit Pappu (00:16:38):

So the key sort of takeaway is that as chi becomes more positive, we essentially end up in this regime where the solution will separate into two phases, a polymer-rich phase that will coexist with the polymer-dilute phase. Conversely, if chi is negative, you essentially get a fully miscible solution. So years ago, now, there was this sort of well-defined equivalence, which basically said that in a poor solvent, a chain will collapse on itself to minimize the interactions with the surrounding solvent. But if you crank up the concentration, those intramolecular interactions will be replaced with intermolecular interactions, giving rise to a dense phase that is in equilibrium with a dilute phase. So in effect, the single-chain collapse is really a single chain’s manifestation of phase separation.
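
To make the "chi becomes more positive" statement concrete, here is a small sketch based on the plain textbook Flory-Huggins free energy, without the three-body term and in the standard convention where the theta point is at chi = 1/2. In that model the spinodal follows from setting the second derivative of the free energy to zero, and the critical point has a closed form. The function names are hypothetical and the chain length in the usage example is arbitrary.

```python
import math
from typing import Tuple


def spinodal_chi(phi: float, n: float) -> float:
    """Chi at which a homogeneous solution of volume fraction phi becomes unstable.

    From d^2 f / d phi^2 = 1/(n*phi) + 1/(1 - phi) - 2*chi = 0.
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    return 0.5 * (1.0 / (n * phi) + 1.0 / (1.0 - phi))


def critical_point(n: float) -> Tuple[float, float]:
    """Return (phi_c, chi_c) for chains of n repeat units."""
    root_n = math.sqrt(n)
    phi_c = 1.0 / (1.0 + root_n)
    chi_c = 0.5 * (1.0 + 1.0 / root_n) ** 2
    return phi_c, chi_c


def is_unstable(phi: float, chi: float, n: float) -> bool:
    """True inside the spinodal, where demixing is spontaneous."""
    return chi > spinodal_chi(phi, n)


if __name__ == "__main__":
    phi_c, chi_c = critical_point(100)  # arbitrary chain length
    print(phi_c, chi_c, is_unstable(phi_c, chi_c + 0.1, 100))
```

Below `chi_c` the solution is miscible at every concentration; above it, a window of volume fractions opens in which the solution splits into dilute and dense phases, and the critical volume fraction shifts to lower values as the chains get longer.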

Rohit Pappu (00:17:38):

So this equivalence, it turns out, is quite readily realizable for homopolymers. And in fact, you can get something that looks like this: so you can write the free energy in terms of the two-body interactions and the three-body interactions. And if you minimize the free energy, then plot it in this sort of weird orientation where the Rg is now on the abscissa and temperature is on the ordinate, you will see sort of a collapse transition as we decrease the temperature or an expansion transition as we increase the temperature. And there will be a theta temperature where the chain is basically agnostic.

Rohit Pappu (00:18:18):

So what is beautiful about this is that if I have this profile, which I can either compute or measure, I can actually extract the chi and the three-body interaction coefficients from this profile. And what would be nice is to have a theory that basically allows us to put the values of chi and the values of W, where chi will be temperature-dependent, into a theory that allows us to compute the phase boundary, which basically says the shaded region is where the two-phase regime is stable. Namely, that’s when you will realize phase separation. Outside of the shaded region is where we’ll get the one-phase regime. And the question is, can we do this? And it turns out that there is actually a beautiful theory developed by Guido Raos and Giuseppe Allegra back in the ’90s, that unlike Flory-Huggins, actually directly connects the single-chain properties to the two-phase behavior, and Xiangze Zeng, a current postdoc in the lab, figured out how to actually use this theory.

Rohit Pappu (00:19:22):

And basically, the protocol looks something like this. You perform all-atom simulations of single chains at different temperatures. You use this essentially to estimate the coil-to-globule transition and really the theta temperature. From this we actually extract what we call the contraction ratio, which is, vis-a-vis the theta state, how contracted the chain is below the theta temperature. From that profile, the theory allows us to compute the two- and three-body interaction coefficients by fitting to the Gaussian Cluster Theory.
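
The contraction-ratio step of this protocol can be sketched as follows. The definition used here, root-mean-square Rg at a given temperature divided by the Rg at the theta temperature, is an assumption made for illustration, and the function names and numbers are hypothetical; the subsequent fit to the Gaussian Cluster Theory is not shown.

```python
import math
from typing import Dict, Sequence


def contraction_ratio(rg_samples: Sequence[float], rg_theta: float) -> float:
    """Root-mean-square Rg of the samples divided by the theta-state Rg."""
    if not rg_samples or rg_theta <= 0:
        raise ValueError("need at least one sample and a positive rg_theta")
    mean_square = sum(r * r for r in rg_samples) / len(rg_samples)
    return math.sqrt(mean_square) / rg_theta


def contraction_profile(rg_by_temperature: Dict[float, Sequence[float]],
                        rg_theta: float) -> Dict[float, float]:
    """Contraction ratio at each simulated temperature, sorted by temperature."""
    return {t: contraction_ratio(rg_by_temperature[t], rg_theta)
            for t in sorted(rg_by_temperature)}


if __name__ == "__main__":
    # Hypothetical Rg samples (nm) from single-chain runs at three temperatures (K).
    runs = {280.0: [1.6, 1.7, 1.5], 300.0: [2.0, 2.2, 2.1], 320.0: [2.6, 2.8, 2.7]}
    print(contraction_profile(runs, rg_theta=2.7))
```

Values below 1 indicate a chain more compact than its theta-state reference, which is the regime the protocol uses as input to the theory.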

Rohit Pappu (00:19:54):

And then once we have that, essentially within about a minute, we have a phase diagram. And this is how it works. Here is, let’s say, the coil-to-globule transition simulated now using the ABSINTH model for a series of polyQ lengths. And from that, we get directly the binodal, which is the phase boundary shown in red, and the instability line, which is the spinodal shown in blue. Okay, so this basically says that if you have nothing but homopolymers, pretty much, we’re all set. No new physics really needs to be learned. But life, of course, always gets interesting and challenging, which is, in the last lecture I told you that these proteins that drive phase separation are really associative polymers. I never said they were homopolymers. They are not poly stickers. They have stickers and spacers.

Rohit Pappu (00:20:48):

And so, then, the question becomes: is the framework that’s good for describing the phase behavior of flexible homopolymers importable or transferable to associative polymers? If so, why? If not, why not? So this was the question that really sort of occupied our interest for a long time. And it sort of, I think, occupied the joint interest really of my good friend, Tanja Mittag, with whom we’ve sort of collaborated now extensively over the past five and a half years, her postdoc, Erik Martin, who I think writes perhaps the longest emails of all people that I know, and Ivan Peran, with whom we collaborated when he was a graduate student in Dan Raleigh’s lab.

Rohit Pappu (00:21:33):

So having this experimental sort of closing the loop really helped us a lot in answering key questions. So we focused in on Tanja’s favorite system, which is the low complexity domain from the hnRNPA1 protein. And I’ll quickly call your attention to the fact that by no stretch of the imagination is this sequence a homopolymer. So there are two questions that arise here. What are the stickers? What are the spacers? And can one describe this phase behavior using the physics that I just tried to sort of articulate in whirlwind fashion for you by calling on the physics of homopolymers. So what Alex Holehouse did was he said, “Well, let’s first ask what the single-chain simulations look like when we perform simulations using an atomically detailed description based on the ABSINTH implicit solvent force field paradigm and implicit solvation model.”

Rohit Pappu (00:22:34):

So here is a movie which often is sort of worth a thousand words. Basically, you see this chain displaying a heterogeneity of conformational preferences. What are colored here in orange are the aromatic groups, which appear to be the stickers that are sort of drawing this sort of chain together, whereas the spacers that are in between the aromatic groups are trying to stretch the chain apart. And so, we always need touchstones to ask, well, okay, we see some sort of a compaction, what does that actually mean? What that actually means is we can do simulations where we simulate this chain in atomic detail, as a self-avoiding walk, you get the Rg distribution shown in green.

Rohit Pappu (00:23:18):

You can essentially cancel out the intramolecular and the chain-solvent interactions and ask, what would this chain do if it were to have a theta solvent-like Rg distribution? Shown here in pink. And what do we actually get from the simulations? In this case, at about 310 K, it’s the one that is shown in black. And what you find is that if you were to fit the mean radius of gyration to the scaling relationship, you get a scaling exponent of 0.59 for the self-avoiding walk or self-avoiding random coil, which is to be expected, 0.5 for the Gaussian chain, but something that’s lower than that for the all-atom simulations.
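
One way to see where numbers like 0.59 and 0.5 come from is to fit a power law, distance = prefactor × separation^ν, by linear regression in log-log space. The sketch below assumes the input is a set of mean inter-residue distances as a function of sequence separation; that choice of input, the function name, and the synthetic data are all illustrative assumptions, not the fitting procedure used in the work described.

```python
import math
from typing import Sequence, Tuple


def fit_scaling_exponent(separations: Sequence[float],
                         distances: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log(distance) = log(prefactor) + nu * log(separation)."""
    if len(separations) != len(distances) or len(separations) < 2:
        raise ValueError("need two equally long inputs with at least two points")
    xs = [math.log(s) for s in separations]
    ys = [math.log(d) for d in distances]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx
    prefactor = math.exp(mean_y - nu * mean_x)
    return nu, prefactor


if __name__ == "__main__":
    seps = [5, 10, 20, 40, 80]
    # Synthetic distances for an ideal Gaussian-like chain (nu = 0.5, prefactor 0.6 nm).
    dists = [0.6 * s ** 0.5 for s in seps]
    print(fit_scaling_exponent(seps, dists))
```

An exponent near 0.59 indicates self-avoiding-walk behavior, 0.5 indicates theta-like behavior, and lower values indicate compaction relative to the theta state.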

Rohit Pappu (00:24:05):

So Erik Martin has perfected the art of doing these scattering measurements which is coupled to size exclusion chromatography. These are very challenging systems, but the size exclusion chromatography ensures that one gets very sort of good quality data. And so, these are measurements shown here. And then Erik basically took Alex’s simulated ensembles and he essentially said to him, “I don’t trust your fitting, I will fit and see whether your ensembles are correct.” And in fact, that turned out to be the case. In this particular instance, you get sort of mean radii of gyration from the simulation shown here and from the experiments so they are sort of on the same page.

Rohit Pappu (00:24:51):

One can then use the method developed by Josh Riback when he was a graduate student in Tobin Sosnick’s lab to actually fit a molecular form factor to scattering data. And then, what one gets are these apparent scaling exponents, which essentially make the point that, in terms of congruence with the experiments, it appears that the simulations are effectively telling the same story with regard to the ensemble. So what do these ensembles look like? They are essentially reflecting an interplay of these aromatic groups trying to compact the chain and the groups that are interspersed between the aromatic residues trying to stretch the chain out. And you see these large fluctuations with average properties that are concordant with something that is a bit more compact than a Flory random coil.

Rohit Pappu (00:25:41):

Now, of course, the minute you say something about any kind of compaction, the obvious question that arises is, ah, is there some sort of secondary structure that is being acquired? And so, Tanja, Ivan and Erik basically did their due diligence and recorded NMR spectra. You see a very narrow chemical shift dispersion along the proton axis. All of the peaks were assigned. There actually are some NOEs which we’ll get to in just a second. But the key point is that there’s no real persistent secondary structure when the NMR data are actually analyzed and that is true even from the simulations.

Rohit Pappu (00:26:26):

Now, since I’m not going to be terribly eloquent about NOEs, what I’ll do is tell you about what is really going on from the vantage point of simulations. So what one can do is analyze the pattern of intramolecular distances and calibrate these distances that one quantifies the ensembles, vis-a-vis what one would expect in the self-avoiding walk limit. So the idea here is that if you see patches of blue, then what the chain is doing is actually it’s trying to expand more so than the self-avoiding chain, and I could be wrong, it actually may be the Flory random coil here. Conversely, if it is in red, then you’re actually seeing a compaction with respect to this reference. And what you see is lots of, sort of red patches, and they all generally coincide with the locations of these aromatic groups, indicative of the idea that there’s a distributed network of weak interactions that are sort of drawing the chain together, as the movie tried to suggest.

Rohit Pappu (00:27:29):

So all these data, and I should point out that here we kept pursuing this moral equivalence between single-chain ensemble determinants and the determinants of phase behavior. So what we said was, okay, the single-chain ensemble is likely to be governed by these aromatic residues that are sort of pulling the chain together, and so they might be the stickers. And so, of course, that leads to the hypothesis, which can then be tested. So what we did, and here it actually was entirely Tanja and Erik sort of coming up with very interesting designs that they passed along to Alex, and these were called Aro+, which means that you increase the number of aromatic residues, Aro-, which means you sort of depreciate the number of aromatic residues, and then Aro--, which means you further depreciate the number of aromatic residues.

Rohit Pappu (00:28:26):

So the length is fixed. The length is exactly the same. So the aromatics were substituted to serines in this particular case, in some cases to glycines, but mostly to serines. So there isn’t a trivial change to be expected because of changes in length. So the simulations basically suggest something very interesting and striking, which is that as we increase the aromatic content, the chains become more compact; the Rg distribution shifts toward lower values. As we decrease the number of aromatic residues, we essentially see a shift in the Rg distribution toward higher values concordant with the idea that the aromatic residues are driving the compaction. Whereas a depletion of the aromatics is enabling an expansion.

Rohit Pappu (00:29:15):

What’s good for the goose is good for the gander in the sense that you can actually go and do measurements. You, here being Erik, actually did SAXS measurements for all of these variants. Overlaid on the solid symbols are the actual fits generated from the simulation results, and the fits are good. So essentially, you see that the Rg values decrease as we increase the number (i.e., the valence) of aromatic residues, and the apparent scaling exponent goes down as we increase the valence (or the number) of aromatic residues.

Rohit Pappu (00:29:51):

So this then led us to say, “Well, okay, let’s now do a coarse-grained model where what we’ll do is basically simplify our geometry to be pure stickers and spacers whereby all the aromatics are going to be turned into orange beads, all non-aromatic residues are going to be blue beads.” These are simulations done using an engine called PIMMS developed by Alex Holehouse who was then a postdoc, first a graduate student, then a postdoc in the lab, he now has his own independent position across the park here in the Department of Biochemistry.

Rohit Pappu (00:30:27):

So this is a Monte Carlo simulation engine that uses a single bead per residue on the lattice, and you can either have a phenomenological force field or a learned model. In this case, we decided to go with the learned model, meaning that we use the experimental data for the mean Rg and the simulated ensembles for the Rg histograms, and this is from the all-atom simulations, and we effectively parameterize the sticker-sticker interaction, the sticker-spacer interaction, and the spacer-spacer interaction strengths to recapitulate the Rg values and the Rg distribution.

Rohit Pappu (00:31:06):

So it turns out that there are numerous models that will do just as well. In this particular model, which we ended up using and transferring across multiple low complexity domains, the sticker-spacer interactions are worth about a third of the sticker-sticker interactions and the spacer-spacer interactions are worth about 15% of the sticker-sticker interactions. So Alex then does these simulations, and he uses the experimentally measured phase boundaries to estimate how to convert from simulation units for temperature to experimental currency, and also from volume fractions into the actual protein concentrations. So I’ll walk you through this plot. Basically, shown in blue symbols are the coexisting concentrations at a particular temperature derived from the stickers-and-spacers simulation.
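
The coarse-graining described above, aromatic residues mapped to sticker beads, everything else to spacer beads, and three interaction strengths in the stated ratios, can be sketched as a lookup. This is not PIMMS and does not use its API: the names are hypothetical, the set of aromatic residues (F, Y, W) is an assumption, and the sticker-sticker energy of -1.0 is an arbitrary reference unit with attraction taken as negative.

```python
AROMATIC = frozenset("FYW")  # assumption: residues treated as stickers
STICKER, SPACER = "S", "L"   # hypothetical one-letter bead labels

EPS_STICKER_STICKER = -1.0                          # arbitrary reference unit
EPS_STICKER_SPACER = EPS_STICKER_STICKER / 3.0      # "about a third" per the talk
EPS_SPACER_SPACER = 0.15 * EPS_STICKER_STICKER      # "about 15%" per the talk


def to_beads(sequence: str) -> str:
    """Map a one-letter amino acid sequence to one bead per residue."""
    return "".join(STICKER if res in AROMATIC else SPACER
                   for res in sequence.upper())


def pair_energy(bead_a: str, bead_b: str) -> float:
    """Contact energy between two beads, in units of the sticker-sticker strength."""
    for bead in (bead_a, bead_b):
        if bead not in (STICKER, SPACER):
            raise ValueError(f"unknown bead type: {bead!r}")
    n_stickers = (bead_a == STICKER) + (bead_b == STICKER)
    return (EPS_SPACER_SPACER, EPS_STICKER_SPACER, EPS_STICKER_STICKER)[n_stickers]


if __name__ == "__main__":
    beads = to_beads("GSYGGFSNR")  # hypothetical fragment, not the A1-LCD sequence
    print(beads, pair_energy(beads[2], beads[5]))
```

Variants such as those with more or fewer aromatic residues change only the bead string passed to the simulation, while the three energies stay fixed, which is what makes the model transferable across related sequences.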

Rohit Pappu (00:32:06):

Ivan, and Anne Bremer, and Erik in Tanja’s lab then did measurements using the NanoDrop method coupled with absorbance measurements to actually ask, what are the concentrations at these different temperatures that are coexisting concentrations, dense phase, as well as dilute phase? And I should point out that there are not too many of these measurements that have actually been done over the years.

Rohit Pappu (00:32:36):

To make sure that there weren’t any artifacts from the approach that they were using, we collaborated with Andrea Soranno, my colleague here at Wash U, who then used FCS measurements to actually measure the coexisting concentrations at one particular temperature, showing that, in fact, the measurements that Tanja and company were making were on the right track. Mina Farag, a graduate student in the lab, fit these data to the mean-field Flory-Huggins model to be able to estimate the critical point, because with PIMMS simulations, it’s a little hard to get at critical temperatures because, essentially, your fluctuations take over the entire box and that becomes a real pain. So essentially, then, the critical temperatures were sort of estimated using the Flory-Huggins fit, which was also quite revealing in that the mean-field theory, which is purely about homopolymers, actually does a pretty good job of fitting the entire binodal.

Rohit Pappu (00:33:30):

And then cloud-point measurements, done by Ivan, essentially start on one side of the phase boundary and go around the critical point. It turns out that that’s one way to estimate the critical point, and everything works out well. Then what you can do is repeat the same process now for Aro+, Aro-, Aro–. It turns out that measurements were impossible to do on Aro– because this was always highly soluble, never really undergoing phase separation even at ridiculously high concentrations.

Rohit Pappu (00:34:05):

And in fact, what you see is the width of the two-phase regime increases as the number or the valence of aromatic residues increases. So Aro++ has a higher critical temperature, wider two-phase regime for any particular temperature, followed by wild type, followed by Aro-, and then followed by Aro–. I’ll show you what these actually look like. So just to give you a sense, what Alex has done here is basically pick a particular quench depth. That means we’ve set up a simulation condition that happens to be this temperature here, around five degrees Celsius, at this high protein concentration. And what you essentially should see is that, pretty much as you start collecting the simulation data, you start to see this one droplet forming that coexists with a dilute phase.

Rohit Pappu (00:34:58):

Same conditions now, and then you look at what happens with Aro1 or Aro-, essentially, now, you start to see the droplet sort of shedding a lot more molecules. Clearly, the droplet is much more labile. Same conditions for Aro– and effectively, since this is well in the one-phase regime, you basically see no droplet formation. So the last sanity check was to say, “Hey, Flory-Huggins, of course, doesn’t really know anything about single-chain behavior. So maybe the fitting that Mina saw was something of an accident.”

Rohit Pappu (00:35:35):

Xiangze sort of revisited these data, using the Gaussian Cluster Theory of Raos and Allegra. And indeed, it turns out that now, if you took the single-chain simulations from PIMMS, derive the two-body interaction coefficients and the three-body interaction coefficients, you can calculate full phase boundaries. And in fact, in this case, you can pretty much nail the critical temperature as well, because the theory doesn’t have the weakness of the simulations, we can directly calculate the critical points.

Rohit Pappu (00:36:03):

So the question I set out to ask was, do LCDs, or do these archetypal sticker and spacer systems behave in accord with the expectations of the homopolymers? The answer is, yes; what’s good for the single chains turns out to be good for the multi-chain phase behavior. Well, then the question is, why? So then we reasoned that there must be a sequence patterning here that is engendering this behavior, because effectively it’s somehow saying that the stickers are behaving like they are not impacted much by being stickers that are different from spacers. And so, we thought about a few things, we knew what the stickers were, and so, we came up with this idea of asking the following question, let’s quantify how the stickers are spaced with respect to one another and with respect to the spacers in the actual wild type sequence.

Rohit Pappu (00:37:05):

And in order to do that, you actually enumerate hundreds of thousands of different sequences where you keep the composition fixed, you shuffle the residues with respect to one another, you quantify the shuffling of the stickers and spacers with respect to one another in terms of what we call a segregation or mixing parameter, which is this Omega value. And suffice it to say that if Omega is very close to zero, the stickers are uniformly distributed along the sequence. If the Omega value is close to one, we basically clustered all of the stickers linearly along the sequence, thereby creating a super sticker.

Rohit Pappu (00:37:45):

So simple question, if we were to change this patterning parameter, what would be observed? But first question, what does nature like? So here is something where we basically, since we had all of these shuffles, we can quantify the histogram of Omega values. At random, the Omega value for this composition would be peaked around a number of about 0.55, but the wildtype A1-LCD sits down here at about 0.36. That means that more than 99.999% of the sequences you could dream up of with this composition would have a segregation of stickers that would be higher than the wildtype.
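
The shuffle-and-compare procedure described here can be sketched in a few lines. Note the important caveat: the score below is a toy patchiness measure (the mean squared deviation of the local sticker fraction from the global fraction over sliding windows), not the Omega parameter from the talk, whose definition is not given here. The function names, the window size, the default sticker set, and the example sequences in the usage note are all assumptions for illustration.

```python
import random


def toy_segregation(seq, stickers="YFW", window=5):
    """Toy patchiness score (NOT the Omega parameter from the talk).

    Mean squared deviation of the local sticker fraction, measured in
    sliding windows, from the overall sticker fraction. Evenly spaced
    stickers score near zero; clustered stickers score higher.
    """
    is_sticker = [1 if res in stickers else 0 for res in seq]
    global_frac = sum(is_sticker) / len(seq)
    n_windows = len(seq) - window + 1
    total = 0.0
    for start in range(n_windows):
        local_frac = sum(is_sticker[start:start + window]) / window
        total += (local_frac - global_frac) ** 2
    return total / n_windows


def fraction_more_segregated(seq, n_shuffles=2000, seed=0, **kwargs):
    """Fraction of composition-preserving shuffles scoring above seq."""
    rng = random.Random(seed)
    reference = toy_segregation(seq, **kwargs)
    residues = list(seq)
    higher = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition fixed, changes patterning
        if toy_segregation("".join(residues), **kwargs) > reference:
            higher += 1
    return higher / n_shuffles
```

With a made-up, perfectly periodic sequence such as `"YGGGG" * 6`, nearly every shuffle is more segregated than the original, which is the qualitative situation described for the wildtype; a made-up blocky sequence such as `"G" * 12 + "Y" * 6 + "G" * 12` sits at the opposite extreme.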

Rohit Pappu (00:38:33):

This looks like a selection principle, which is saying that somehow these sequences are trying to push the stickers apart and distribute them more uniformly along the sequence. So leads to the obvious question, does this patterning matter? So, of course, what we did was we designed a few variants here, what we call Aro Perfect, where we basically make Omega even lower than the wildtype, Aro Patchy, where we make the Omega value higher than the wildtype. And you can see, basically what we’re doing is we’re basically going truly into the outlier. So Patchy sits out here, wildtype sits in the tail of the distribution and Perfect is even more well mixed.

Rohit Pappu (00:39:18):

And it turns out in the simulations, the Perfect makes beautiful droplets. So we haven’t changed the valence mind you, the composition is identical to the wildtype. The wildtype makes droplets, this we’re showing it for one temperature, but the Patchy variant does something really interesting, it makes these sort of micellar looking mesophases, which then start to basically make very long spiny structures, essentially indicative of what should be expected of a system that is precipitating out of solution. And I should also point out that this happens at really low concentrations, at concentrations that are about three to four orders of magnitude lower than what you would see for the sort of lowest concentration threshold for phase separation for the wildtype.

Rohit Pappu (00:40:03):

Those are simulations, what happens in the experiments? We can do lots of quantification. Turns out that, basically, the take-home message is here in the picture, in the fluorescence micrographs, you can actually see Aro Perfect making perfectly nice distributed droplets, that if you let them sit long enough, they will actually grow via Ostwald ripening into one gigantic droplet. Same would be true for the wildtype. But the Aro Patchy, at really low concentrations, basically just crashes out of solution, making precipitates.

Rohit Pappu (00:40:34):

And so, then, Alex asked the question of, he said, “Well, okay, so is there a general principle at play here?” And it turns out that what he did was he went through all of the prion-like domains in the IDR-ome, if you will, and asked the question of, “in these prion-like domains when the number of aromatic residues is above a certain threshold, let’s say 10% of the sequence, is it the case that the aromatic residues are, in fact, uniformly well distributed along the sequence?” And the answer is, yes. For all of the LCDs that one finds in these various proteins here, they’re grouped by GO annotations, RNA-binding, vesicular trafficking.

Rohit Pappu (00:41:16):

In fact, Alex sort of was very interested to find that ANXA7 showed up in this annotation and said, “Hey, nobody’s reported this as a candidate for phase separation.” And then, a week after he made that pronouncement during a walk for coffee, Jennifer Lippincott-Schwartz’s paper came out in Cell showing that indeed that was the case. There was a really beautiful outlier which is Xvelo, the PLD from Xvelo, which many of you will recognize, is one of the main components of Balbiani bodies. And Balbiani bodies, as many of you know, form these very gigantic irregular solid-like morphologies.

Rohit Pappu (00:41:56):

And, in fact, Elvan Boke, when she was working with Tim Mitchison and Tony Hyman, tried very hard to make this PLD in such a way that it would actually make a liquid. But it turns out that she and Alex have been working together here, and I should remind you that this is unpublished data, they have been able to re-pattern the PLD while keeping the composition fixed, and now they can make effectively a Balbiani body or a facsimile of it that is more liquid-like. So the takeaway then is valence matters, just like with folded domains, patterning matters.

Rohit Pappu (00:42:34):

And so, there appears to be this, sort of to quote the late Chris Dobson, Life on the Edge behavior going on, where effectively, what these systems are trying to encode into their sequences is this ability to use the multivalence, use the interaction strengths, but it’s the Goldilocks idea. I mean, it’s got to be just right. You cannot have too much of a good thing or too little of a good thing, it needs to be just right. So maybe we’ll just start calling this the Goldilocks principle, but anyway.

Rohit Pappu (00:43:08):

So, that brings me to the last part of the talk, which is, there is more to IDRs and IDPs than homopolymers or low complexity domains. In fact, there’s a heck of a lot of complexity already in these purported low complexity domains. Yes, they behave like homopolymers, but they do so because of an interesting encoding in the grammar of their sequences. Now, there is a gigantic fly in the ointment here, which is that everything that I showed you focuses on low complexity domains and the threshold concentrations for phase separation under the most generous “physiologically relevant solution buffer conditions” is well above 100 molar, micromolar rather, sorry.

Rohit Pappu (00:43:56):

These, we can all agree clearly correspond to maybe highly overexpressed solution conditions. And there is a vibrant debate in the literature seeing these data as a serious challenge for the in vivo relevance of, let’s say, LCD-driven phase separation. So as we head toward, sort of trying to resolve this conundrum, the first thing that comes to mind, of course, is that the proteins from which we derive these low complexity domains are not just low complexity domains, there’s a whole heck of a lot more that’s going on in there. And so, what we started to do was to ask the question of, can we dissect the contributions of the prion-like domains that have garnered a lot of attention, but also the RNA-binding domains that typically encompass one or more RNA recognition modules that are folded?

Rohit Pappu (00:44:53):

And interspersed among them or between them are these so-called RGG domains. If you go looking for an RGG motif, you will never find them in the RGG domains, but typically, there is an arginine-glycine richness to them. Notorious candidates, FUS, Fused in Sarcoma, EWSR1, most of you who are aficionados of condensates will of course recognize this. And so, a few years ago, Tony Hyman, Simon Alberti and I started talking about this. And they were working on some beautiful sort of dissections of what’s going on with full-length FUS, driven entirely by an amazing postdoc Jie Wang, who is now actually in Rick Young’s lab, and my former postdoc Jeong-Mo Choi, who basically sort of became the sequence gazer and the field theory sort of person who helped think through these data.

Rohit Pappu (00:45:50):

So let’s start with FUS. So basically, now, you take full-length FUS. These are experiments done in 150 millimolar potassium chloride at room temperature. You basically start to crank up the concentration and as you go past some threshold concentration around… so notice that there were a few points missing, but of course, it turns out that the threshold concentration is actually between three and five micromolar, you start to see condensates forming. You can go back and do due diligence and actually quantify the fact that, indeed, the threshold concentration is somewhere here, that’s called the saturation concentration.

Rohit Pappu (00:46:31):

Let’s go back and then ask how these thresholds or saturation concentrations, A, do they depend on sequence, or are they basically the same for every intrinsically disordered protein that has a PLD and RBD? The answer is no, they are very much dependent on sequence. But what’s also interesting is that there is an interesting correlation between the amount of protein that you will actually produce under, let’s say, basal conditions in your typical cell as measured here sort of in maroon versus the saturation concentrations shown in blue.

Rohit Pappu (00:47:07):

Now, there is still an order of magnitude discrepancy, but it turns out that when you start thinking about the inclusion of RNA, now you bring the threshold concentration pretty much into the in vivo regime. So there are two things going on, what’s sequence intrinsic and the heterotypic interactions, which we’re going to discuss in the next lecture. So basically, what this says is that when you start thinking about full-length FUS, full-length EWSR1, life becomes much more relevant and interesting. So the question then was what are the determinants of the sequence-specific saturation concentration values?

Rohit Pappu (00:47:43):

So, as we were thinking this through, Alex Holehouse and Jeong-Mo Choi basically worked together on sort of dissecting the human proteome of all the IDRs and asked the question of, is there anything distinctive about, let’s say, the sequence signatures of the compositional biases in these FET family proteins? And the thing that jumped out is that while it is generally the case that these disordered regions don’t have a whole heck of a lot of tyrosine and arginine residues together (and the “and” is important), the FET family proteins really jump out as being distinctive for their sort of high tyrosine and arginine content. Here is TAF15, for example, here is EWSR1, here’s FUS, and so on and so forth.

Rohit Pappu (00:48:32):

So, A, these high frequencies are uncommon, B, they seem to be a distinctive signature of these FET family proteins. Does it matter? So then what Jie set about to do was basically start to do a molecular dissection experiment, where basically, you measure the saturation concentration, in this case, at 75 millimolar potassium chloride. You ask, under these conditions, does one observe phase separation of the PLD alone? The answer is, no. What about the RBD alone, which encompasses the RRM and the RGG region? The answer is, no. Put the two in trans to one another, you actually start to see condensates forming, albeit at a concentration that is higher. So clearly, there’s a cooperativity that is encoded by tethering these together that has entirely to do with the overlap concentration.

Rohit Pappu (00:49:24):

So that then led us to the hypothesis that, hey, the amino acid stickers, in this case, must be tyrosines and arginines. And so, what Jeong-Mo said was, “Okay, we can generalize the beautiful theory of Semenov and Rubinstein to start thinking about heterotypic interactions and ask the following question of whether or not we can develop a theory that would predict how the saturation concentration should vary with the number of tyrosine and arginine stickers.” I’ll cut the long story short, and basically give you sort of the final answer, which is that you can basically demonstrate that the saturation concentration will be inversely proportional to the product of the number of tyrosines and the number of arginines, zeroth order field theory.
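
The zeroth-order scaling stated here, a saturation concentration inversely proportional to the product of the tyrosine and arginine counts, can be turned into a small relative estimate. This is a sketch of the proportionality only: it predicts ratios between sequences, not absolute concentrations, and the function names and the toy sequences in the usage note are hypothetical.

```python
def sticker_product(seq):
    """Product of the tyrosine (Y) and arginine (R) counts in a sequence."""
    return seq.count("Y") * seq.count("R")


def relative_csat(seq, ref_seq):
    """Predicted c_sat of seq divided by c_sat of ref_seq.

    Assumes only the zeroth-order scaling c_sat ~ 1 / (N_Tyr * N_Arg);
    the unknown prefactor cancels in the ratio.
    """
    product = sticker_product(seq)
    ref_product = sticker_product(ref_seq)
    if product == 0 or ref_product == 0:
        raise ValueError("scaling is undefined without both Tyr and Arg")
    return ref_product / product
```

For instance, with the made-up sequences `"YYYYRRRR"` and `"YYRR" + "G" * 10`, doubling both sticker counts gives `relative_csat` of 0.25, that is, a fourfold lower predicted saturation concentration.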

Rohit Pappu (00:50:17):

And effectively, it turns out that the measured saturation concentrations do indeed correlate with this inverse product quite strikingly well. And this is now spanning two to three orders of magnitude, so this is not some accidental fit. Jeong-Mo has basically then built on this theory to account for both homotypic and heterotypic interactions. Because the key question to ask is how much stronger should a tyrosine-arginine interaction be over a tyrosine-tyrosine interaction in order to see this selectivity for tyrosine-arginine interactions? And Jeong-Mo’s new theory basically makes the point that an advantage of about 0.1 to 0.2 kT is more than sufficient. So it’s really, in this case, not in others, weak distributed multivalent interactions that turn out to be what’s governing the driving force for phase separation. And this paper is currently under review, one can access this at the physics arXiv.

Rohit Pappu (00:51:19):

Now, the minute you see arginine-tyrosine, all physical chemists will say, “Ah, well, you’re not teaching us anything new, you’re telling us about cation−π interactions.” Well, okay, good, I’m glad all this is well known, I think there was some heretic on Twitter saying that he was going to teach all of us P-Chemistry. So thank you, sir, we know our P-Chemistry. But the way we want to think about this is we want to think about this in terms of: are the different cations and π-systems different from one another? So one can think about this in the classical sense or one can think about this in terms of molecular orbitals. So one can think about sort of these cations having a net charge, there could be a net dipole moment. And then the way the electron cloud is delocalized one can also have a quadrupole moment.

Rohit Pappu (00:52:08):

The reason this matters is one essentially, now, starts to get a hierarchy of interactions, which means that, remember the point I made about heterogeneity, that will be prevalent here as well, where we’ll start to get a range of interaction strengths and a range of interaction ranges, as we think about charge-charge, charge-dipole, and so on. So that led us to the idea that hey, the intrinsic multipole moments must be different from one another. So tyrosine, which has this in-plane dipole should have a high dipole moment and a high quadrupole moment, this is from the National Institute of Standards. Phenylalanine should only have a large quadrupole moment. Arginine and lysine should be different from one another because, of course, arginine has this sort of Y aromaticity that gives it this sort of fork-tongued-like arrangement of the electron cloud giving it actually a rather whoppingly big quadrupole moment, whereas lysine really is nature’s best approximation as far as the amine is concerned of a point charge.

Rohit Pappu (00:53:12):

So this led to the hypothesis, and then we tested it, that tyrosine and arginine should make for stronger stickers than phenylalanine and lysine. So here are the saturation concentrations that Jie measured at 75 millimolar potassium chloride, down close to between two and three micromolar. Change all the tyrosines to phenylalanine, and that increases to close to 10 micromolar, suggesting that the substitution has weakened the sticker strength. Replace all of the arginines in the disordered region with lysines, same effect, you effectively weaken the driving forces. Do them together, it becomes mostly additive, it turns out that there is a minor electrostatic addendum that shows up here that makes it slightly non-additive, it disappears when you increase the salt concentration.

Rohit Pappu (00:54:03):

So effectively, what this starts to say is that if we start to account for the cationic and aromatic stickers, we start to account for this hierarchy of interaction ranges and strengths, something that Julie Forman-Kay’s lab has actually identified and codified into a predictive ability, actually, first by Tim Nott when he was a postdoc with her, and then eventually in a series of different papers. And we worked recently and I’ll discuss this work next week with Greg Jedd, in Singapore, showing that this difference between arginines and tyrosines, sorry, lysines, turns out to be materially important for discriminating the sequence grammar of nucleoli versus speckles, for example.

Rohit Pappu (00:54:49):

So essentially, I will conclude on this note that there are two ways with IDPs in which you can actually start to sort of tune the phase behavior. Of course, if you increase the valence of stickers and crank up the interaction strength, eventually you will start getting super stickers. And if you have only stickers, you’ll get the polyQ-like behavior, which is, you’ll start to worry about precipitation, aggregation, solid formation. If all you have are spacers, then basically what you’re going to do is just get spacer-driven percolation, so essentially, now, you’re talking about molecules behaving like they’re in polymer melts. And this Goldilocks phenomenon essentially says that there is sort of a beautiful balancing both of the valence of stickers, and the dilution effect of spacers, and the patterning of the stickers to ensure that the stickers don’t become super stickers, thereby giving us more viscous or viscoelastic network fluids.

Rohit Pappu (00:55:52):

So I’ll end and point out that I think I’ve called out everybody as I’ve gone along. I started with describing what Kiersten did, followed by Xiangze and then key work by Jeong-Mo and Alex and then, of course, brilliant collaborations with Hilal, Ed, Tony and Simon are just fantastic. And like I said, Tanja and I have been working together for the better part of the past five and a half years. And next week, I’ll talk about building on these ideas and go largely into unpublished data, but for now, I’ll thank you all, stop and take questions and I think I gave enough time.

Mark Murcko (00:56:30):

Thanks, Rohit, excellent as always, wonderful stuff. We do have a number of questions that have come in and I know we may start to lose some of the audience but let’s take a few questions now. And then as you said, Rohit, you’ll be able to answer additional questions in the next week or so, that’ll be great. There’s a question that came in from Satya Pandey and also Charlotte Fare, related questions really asking about what… it builds on the importance of valence and patterning that you brought up the idea that, will these models work in all cases or are there going to be situations where these models may or may not work? Maybe I turn it over to Charlotte Fare, if she’s still online, to ask her question.

Charlotte Fare (00:57:19):

Hi, I’m here, can you hear me?

Rohit Pappu (00:57:21):

Yes, Charlotte, hello.

Charlotte Fare (00:57:23):

Hi, so you sort of alluded to this with your images of the Patchy versus the well-dispersed hnRNPA1 where you see differences in how they sort of condense or aggregate, but how well does the Flory-Huggins model still apply to those extreme cases where you lose the even patterning and sort of how do you think one could adapt the model or sort of play with that Flory-Huggins equation to more accurately represent a protein like FUS where you have sort of two different separated chunks that are interacting?

Rohit Pappu (00:58:15):

Right, that’s an excellent question. So I should point out that, yes, you’re absolutely right that we deploy the Flory-Huggins on the LCD system. As far as we’re concerned, we think that it might be time for us all to come together and acknowledge that Flory-Huggins is a beautiful phenomenological model that maybe we set aside for now because there are actually better ways to think about all of the beautiful sequence heterogeneity. I think fits are great, but often what ends up happening with the Flory-Huggins model is that the parameters that you get… because the model has some free parameters, right? And if I sort of walk you through here, actually let me do this in a more efficient way so I can find that slide and sort of that way I’m not just critiquing something blindly. So here we go.

Rohit Pappu (00:59:12):

So effectively, if you notice, Charlotte, you’ll see that there’s this chi parameter that we don’t know a priori. There is the W parameter that we don’t know a priori. So the way to use Flory-Huggins is, generally, in a postdiction model where you have to have the experimental data then you fit and then you ask, “Oh, can I look at the value of chi? Can I look at the value of W and ask how they change across different sequences?” A, that’s not predictive, B, that’s generally fitting and, C, what invariably happens is that the values that we obtain, basically, make zero sense. I mean, they kind of give you absurdly different values and that’s largely because it’s a purely mean-field model.

Rohit Pappu (00:59:58):

Two things that we need to actually acknowledge is that nowhere in this equation do we actually say that this is a polymer that involves residues connected to one another. The only place that shows up is in the one over N term. That is simply saying that the translational entropy of what is otherwise a gas of beads, is somehow diminished by the molecular weight of the polymer and that’s it. And I realize that I’m sounding pretty harsh and I’m not even a trillionth as clever as Flory or Maurice Huggins. My point is that, in 2020, we can do better.
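
For reference, the textbook two-component form of the Flory-Huggins free energy of mixing per lattice site makes the point about the one over N term explicit. This is the standard expression, with φ the polymer volume fraction, N the number of segments per chain, and χ the interaction parameter; it is offered as a sketch and is not a reproduction of the equation on the slide, which per the discussion above also carries a W parameter that is not shown here.

```latex
\frac{f(\phi)}{k_{B}T}
  = \frac{\phi}{N}\,\ln\phi
  + (1-\phi)\,\ln(1-\phi)
  + \chi\,\phi\,(1-\phi)
```

Chain connectivity enters only through the factor 1/N multiplying the polymer's translational entropy term; the interaction term χ φ (1 − φ) has the same form it would have for a gas of disconnected beads.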

Rohit Pappu (01:00:38):

And what does better look like? I think at this juncture, you want the theory that basically has two pieces to it. It describes the single-chain behavior from which you can extract the relevant parameters, and basically, start to get at the phase diagram as well. And so, the direct answer to your question is right now the Gaussian Cluster Theory does exactly that because what it does is it describes the free energy of a single chain in terms of the Gaussian distribution of the sort of the Rg values that a chain can actually adopt. It recognizes that that distribution will evolve as a function of temperature, depending on the strengths of the effective two and three-body interaction coefficients.

Rohit Pappu (01:01:27):

It also recognizes that it can be generalized, actually, and it turns out that I recently discovered this, that Mike Cates and Tom Witten had actually developed a precursor of the Gaussian Cluster Theory for stickers and spacers back in 1978, which Xiangze is now working on to do exactly what you’re looking for, Charlotte, and the way we would approach this is to do the following: Either pursue the experiments in the type of single-molecule measurements that, let’s say, Ben Schuler does or the types of SAXS and NMR measurements that Tanja would do to get at a detailed characterization of the single-chain behavior from which we can actually extract the two and three-body interaction coefficients, recognizing that these might have to become matrices to account for the heteropolymeric nature and then you basically calculate your underlying phase diagram.

Rohit Pappu (01:02:22):

Alternatively, we can do simulations as well. And that’s what Xiangze has been doing. And by the way, this works even for systems that have lower critical solution temperatures, but we are also starting to generalize this for changing salt conditions and so on. I realized that was a bit of a long-winded answer but the answer is, we wouldn’t go Flory-Huggins, we’d go GCT.

Charlotte Fare (01:02:43):

Great, thank you so much.

Mark Murcko (01:02:45):

We probably have time for a couple more. So I turned to Satya because she had a different spin on the whole question of patterning and valence, more asking about the biological conservation of those residues.

Satya Pandey (01:03:02):

Yeah, hi, am I audible?

Mark Murcko (01:03:04):

Yes.

Satya Pandey (01:03:06):

Yeah, a really, really nice talk, Dr. Rohit. I had a similar question, but I think you answered some part of it during the subsequent slides. So the question was: do you see a positional effect that dictates which stickers contribute more and which contribute less? And are the important ones more evolutionarily conserved? And the second part is more to do with the saturation concentration and the physiological concentration: would that significantly change in a pathological condition, or in a very localized manner when the proteins are in a complex, for example, where the number of protein molecules at that particular localized concentration would be significantly higher, right?

Rohit Pappu (01:03:51):

Absolutely, so I’ll take your first question first and I’ll sort of be smiley glib and say stay tuned. So naturally, Tanja’s never happy, so she’s been pushing us to think about exactly the question you asked which is what is the context-dependence of these sticker contributions? And indeed, even in this… this is why we’ve started to refer to this glibly as hidden complexity in low complexity sequences because as you anticipate, Satya, it turns out that the local sequence context does indeed matter. So in a lot of these PLDs, it will turn out that whether you have a phenylalanine or tyrosine, and whether or not they contribute sort of significantly or weakly, will be governed by the types of residues, and in particular, charged residues that are directly sort of flanking them. So we have a paper that we’re putting together that basically dissects this in fairly delicious detail.

Rohit Pappu (01:04:55):

And of course, the RGG domain, I think, I hope that in a year’s time we’ll stop referring to it as the RGG domain because there is a whole lot more complexity to RGGs. And in particular, as you go across different condensates, one of the things that have actually been titrated is the valence of arginines and the sequence context of arginines. Also, stay tuned, Matt King has been sort of really focused in on trying to get that dissected.

Rohit Pappu (01:05:23):

And to your second point, again, you’re absolutely right because the way to think about this is we are right now fixated on drawing one phase boundary. But of course, you can imagine that there are multiple phase boundaries depending on what combination of solution conditions and protein concentrations we live in. And so, you could imagine that a condensate actually serves as a crucible by enabling spatial inhomogeneity. So now the local concentration can go really, really high enabling, let’s say, the formation of solids, for example. If, for example, post-translational modifications start to alter the patterning or there are interactions that can actually effectively alter the patterning, because that’s something we know we haven’t yet thought about but you can certainly see how that might happen. After all, condensates are not unimolecular systems in the sense that they’re comprised of multitudes of molecules which we’ll talk about next week.

Rohit Pappu (01:06:20):

And in fact, one of the things that will happen and does happen is that ligands can actually lower the saturation concentrations even further, thereby making this highly physiologically relevant. And that is exactly what you would want, I would argue, because you want regulation over phase behavior. And the way to do that is you encode some of it in the molecule itself, namely the scaffolds, but then you enable the regulators, be they epigenetic changes or ligands to control the phase behavior, because otherwise, you essentially have this thing that goes on and off of its own accord and not your accord (as in the cell).

Mark Murcko (01:07:04):

That’s great, thank you. So I think maybe we just have time for one last question. So I turned to Sneha Roy if you’re still on. A very basic question, I don’t know if he is still on.

Sneha Roy (01:07:25):

Yeah, I’m there, am I audible?

Mark Murcko (01:07:27):

Great, yes, perfect.

Sneha Roy (01:07:29):

Thank you so much, professor, for that great talk. So, yeah, as I’m pretty new to the field. So yeah, my question is, so in a sticker driven reaction, we have precipitates. So I’ve been studying the system which no matter how much I tune the solution by adding salt to weaken the interaction, it directly goes to a soluble state, so there is no sign of droplets. So how can we explain this phenomenon?

Rohit Pappu (01:07:55):

Yeah, and I think at least in a stickers and spacers framework, the way I would say this is that you may well have the requisite valence but in this particular case, the sequence context of those stickers, namely the spacers, are basically ensuring that you’ve effectively diluted the effects of these interactions. And so, I would imagine that there are two possible scenarios, without knowing the details of the sequence that you are talking about. But since I think I know who you are, Sneha, you are talking about gamma- and beta-synuclein. And so, this is something that is kind of interesting because if you compare the sequence architecture of gamma- and beta- to alpha-synuclein, effectively, alpha-synuclein is very much like our Aro Patchy in the sense that it is much more super stickery and whereas gamma, in particular, doesn’t have that particular feature. We’ve got some simulation results if you wish to talk about them in detail, I’m more than happy to share.

Sneha Roy (01:09:00):

Yeah, sure, thank you so much, professor.

Mark Murcko (01:09:04):

Great, so I think we’ll wrap up here. So, Rohit, thanks in advance for all the additional answers to the other questions in the chat. We’ll certainly post those on condensates.com along with the lecture. And thanks, again, for a fantastic talk and for everybody still on the call, remember that we’ll do the last lecture one week from today, next Wednesday, the 12th of August at the same time. So look forward to seeing all of you then, and tell your friends–let’s get everybody on the call. Thanks, Rohit.

Rohit Pappu (01:09:36):

And I should say, thank you all for coming aboard, this has been lots of fun. Thank you, Mark.

Mark Murcko (01:09:39):

Great, thank you, Rohit. Thanks, everyone. Bye-bye.

### EXTENDED Q&A

**Question from Achuthan Raja Venkatesh:** Professor Pappu, I have the following question with regards to discussions from the previous session: To what degree can condensates function as entropy reservoirs, that help compensate for unfavourable biomolecular reactions?

**Rohit’s Response:** The question, if I understand it correctly, appears to be about whether the efficiencies of biochemical reactions can be enhanced…

This idea was first articulated and demonstrated in the field of biomolecular condensates by the Rosen lab in their 2012 Nature paper. There are issues that one has to consider. Do reactants find one another in condensates via diffusive motions or do the kinetics of reactions change? This remains an unresolved issue and in this context, it might be worth revisiting the work of Raoul Kopelman. What we need to measure / compute are the efficiencies of biochemical reactions within condensates when compared to the efficiencies in bulk solution, recognizing that the cytoplasm and / or nucleoplasm are not truly dilute solutions. *Of course, it is more than likely that I have not understood the question. If the answer provided here is unhelpful, let us revisit the discussion via email.*

**Question from Satya Pandey:** Is it just the number of aromatic residues that dictates the expansion, or is there also a “positional effect” that dictates which stickers contribute most and which contribute less? And are the important ones more evolutionarily conserved?

**Rohit’s Response:** The zeroth order effect is the valence (number) of aromatic stickers. As the patterning studies reveal, there is a clear contribution from the effects of linearly clustering stickers vs. distributing them uniformly along the sequence. As for the specific issue of whether each sticker might contribute differently to the overall energetics, this remains a distinct possibility and ongoing work in collaboration with the Mittag lab suggests that this is indeed the case. Whether or not the contexts of specific aromatic moieties are evolutionarily conserved remains to be determined. Here, the ongoing efforts of Alan Moses and colleagues in Toronto and emerging efforts in the Drummond lab as well as the Holehouse lab are likely to be informative in analyzing evolutionary conservation.

**Question from Guanhua He:** How does FCS measure mM-level concentrations? Did he label a small portion of proteins and assume equivalent partitioning in the condensate and dilute phase?

**Rohit’s Response:** Please see our recent Science paper for the relevant details regarding the FCS measurements in the supplemental material.

**Question from Charlotte Fare:** Do you think that the F-H model will work for most proteins with IDRs, or will there be cases where a single chi value/homopolymer model will not work? Perhaps a protein where the “stickers” aren’t so regularly spaced?

**Rohit’s Response:** Adaptations of the Gaussian Cluster Theory might be more informative because it is a theory that describes single chain conformational equilibria and phase equilibria. Please see our recent adaptation of this theory in a practical application. Certainly, one of the findings from our collaborative work with the Mittag lab was that one can use the goodness of fit based on Flory-Huggins theory as a diagnostic of the type of patterning, well mixed vs. segregated linear distribution of stickers. The challenge, however, lies with the use of goodness of fit because one can, in theory, always find a good enough fit provided one is willing to allow for rather non-physical values for the two- and three-body interaction coefficients. In fact, we recently noticed that making the apparent molecular weights be a dynamical variable truly improves the overall fits, but the physical interpretation becomes questionable. Furthermore, the FH theory can only be used post facto to fit data from experiments and it is difficult to use it to make predictions for specific systems. The random phase approximation of Hue Sun Chan and coworkers and the elegant work being done by Charles Sing and Sarah Perry on the physics of complex coacervation seem like very promising theoretical approaches.
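For readers less familiar with the “single chi value/homopolymer model” raised in this question, the following is a minimal sketch of the textbook Flory-Huggins expressions for a homopolymer of length N in a solvent. It is not the fitting procedure used in the work discussed here; the function names and the chain lengths in the loop are illustrative assumptions.

```python
import math


def fh_free_energy(phi: float, n: float, chi: float) -> float:
    """Flory-Huggins free energy of mixing per lattice site, in units of kT.

    phi: polymer volume fraction (0 < phi < 1)
    n:   chain length in lattice sites (assumed homopolymer)
    chi: single effective polymer-solvent interaction parameter
    """
    return (
        (phi / n) * math.log(phi)
        + (1.0 - phi) * math.log(1.0 - phi)
        + chi * phi * (1.0 - phi)
    )


def fh_spinodal_chi(phi: float, n: float) -> float:
    """Chi at which the second derivative of the free energy vanishes at phi."""
    return 0.5 * (1.0 / (n * phi) + 1.0 / (1.0 - phi))


def fh_critical_point(n: float) -> tuple[float, float]:
    """Critical volume fraction and critical chi for chain length n."""
    sqrt_n = math.sqrt(n)
    phi_c = 1.0 / (1.0 + sqrt_n)
    chi_c = 0.5 * (1.0 + 1.0 / sqrt_n) ** 2
    return phi_c, chi_c


if __name__ == "__main__":
    # Illustrative chain lengths only; not taken from the talk.
    for n in (1, 100, 10000):
        phi_c, chi_c = fh_critical_point(n)
        print(f"N={n}: phi_c={phi_c:.4f}, chi_c={chi_c:.4f}")
```

Because the whole sequence is collapsed into one chi and one N, a model of this form has no way to represent where the stickers sit along the chain, which is the limitation the question points to.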

**Question from Leon Babl:** Many proteins that form condensates have been shown to be highly sensitive to ionic strength changes. This suggests that many stickers are electrostatic interactions rather than aromatic ones. Did you perform similar experiments (especially the spacing/patterning) with proteins known to be highly sensitive to ionic strength (e.g. DDX4 …) and would you expect similar results?

**Rohit’s Response:** The effects of patterning of charged residues were investigated by Nott et al. in 2015 for the IDR of DDX4. We worked with the Rosen lab on a different problem, interrogating the effects of charge patterning on phase separation via complex coacervation driven by the Nephrin intracellular domain. We developed the concept of charge interacting elements to highlight the contributions of linear clusters of like charges as enhancers of phase separation via complementary electrostatic interactions. In this context, it is worth mentioning our earlier work on the effects of charge patterning on the sequence-ensemble relationships of archetypal IDPs and their contributions to IDP functions (see here and here). The phase behaviors of model, well mixed vs. blocky polyampholytes that we introduced in our 2013 paper have been studied using theory and simulation by the groups of Hue Sun Chan and Joan-Emma Shea. However, there is an interesting fly in the ointment as we recently noted in collaboration with Greg Jedd that might be worth reading about.

**Question from Pinaki Swain:** Is the percolation concentration from mean field theory used as a proxy for saturation conc. here?

**Rohit’s Response:** Yes it is.
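The mean-field expression used in the talk is not reproduced in this written answer. As a hedged illustration of how a mean-field percolation threshold depends on valence, the sketch below uses the classic Flory-Stockmayer gel point as a stand-in: it gives the fraction of stickers that must be bonded for a system-spanning network to form. Converting that bond fraction into a concentration requires a binding model that is not specified here, and the function name is hypothetical.

```python
def gel_point_bond_fraction(valence: int) -> float:
    """Flory-Stockmayer gel point for molecules carrying `valence` stickers.

    Returns the critical fraction of bonded stickers, p_c = 1 / (valence - 1).
    Assumes equal, independent sticker reactivity and no intramolecular bonds.
    """
    if valence < 2:
        raise ValueError("at least two stickers are needed to form a network")
    return 1.0 / (valence - 1)


if __name__ == "__main__":
    # Illustrative valences only; higher valence lowers the threshold.
    for valence in (2, 3, 5, 10):
        print(valence, gel_point_bond_fraction(valence))
```

The trend matches the point made earlier in this Q&A that valence is the zeroth order effect: the more stickers per molecule, the smaller the fraction that has to be engaged before the network percolates.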

**Question from Erik Martin:** Do you think that the patterning effect in the A1 LCD is driven by effectively combining stickers or by differences in the entropy of confining short versus long spacers? If the latter, could the aggregation be mitigated by changing the nature of the spacers?

**Rohit’s Response:** This is an interesting question. A cursory glance at the length distribution of spacers would suggest that the distribution of spacer lengths has not changed dramatically in Aro^{patchy}. So, to first order, the conjecture is that it is the creation of *super stickers* that leads to the precipitation behavior. However, the intriguing question is if alterations to the spacer properties can in fact alter the underlying phase behavior. This should be possible and, in this context, getting to know the details of the preferred sequence features of spacers becomes quite important. The particular challenge might be the prospect of bypassing liquid-like phases altogether such that we end up with either precipitation or gelation without phase separation.

**Question from Sneha Roy:** Thank you for a wonderful talk. I have a naïve question as I am new in the field. In a sticker driven interaction, which gives rise to precipitates, what could be the explanation for the solution to never go to a droplet state and become completely miscible upon tuning the solution condition, say using salt to weaken the interaction?

**Rohit’s Response:** An extreme way to achieve this would be to change solvent quality. Certainly, changes to salt concentrations or salt types will help accomplish the type of dissolution you propose. Adding denaturants would be another way.

**Question from Ann Kwong:** There are situations where a single amino acid change in an IDR correlates with disease. Are all of these single aa changes correlated with PTM, or can a single aa change affect phase behavior by itself? In other words, does the sticker effect apply at a single aa change in a complex situation?

**Rohit’s Response:** This is a *really important* question. The relative importance or unimportance of single amino acid changes will be governed by the contributions of sequence heterogeneity. In a purely binary setting, where we treat all stickers as being the same, and all spacers as being the same, the only way for a single sticker substitution to impact phase behavior would be via a change that either alters the valence or interaction strengths of stickers. These effects are then exponentiated in the context of collective interactions among IDRs. However, the relative importance of stickers can also be modulated by the local sequence contexts of stickers, whereby the effects of some mutations are buffered whereas some others are amplified.

**Question from Charles Phillips:** Thank you for the talk! I am also new to this field and have a question about the physiological salt conditions. In terms of the droplet formation, do you see a difference when using different salt species such as NaCl and KCl?

**Rohit’s Response:** There are ion specific effects and the Fawzi lab has published some results highlighting these effects. We have not yet published work that focuses on a systematic dissection of the effects of different types of solution ions on the phase behaviors of IDRs.

## Join the conversation