About the Data

The Clean Energy Project Database (CEPDB) can be queried for a number of properties relevant to organic electronics. In the following we will discuss the search fields as well as details of the reported properties and parameters.

Note: for the initial CEPDB release we have restricted the accessible data to the most relevant entries for organic photovoltaic (OPV) applications. Once the performance and stability of this website are confirmed, we will publish additional data tables in short succession. We will also regularly add data on new compounds as screening calculations are completed.

Searchable Data Fields

Principal energy levels in organic semiconductors:

The reported values for εHOMO and εLUMO are our best available estimates, based on Kohn-Sham eigenvalues that have been processed using the calibration and averaging scheme discussed below. εgap is computed as εgap = εLUMO − εHOMO.

Photovoltaic performance parameters:

  • PCE: the power conversion efficiency
  • Voc: the open-circuit voltage
  • Jsc: the short-circuit current density

The reported values for PCE, Voc, and Jsc correspond to the results of the standard Scharber model and the best estimates for εHOMO and εLUMO.

Other data and information:

Calibration and Averaging Scheme for Energy Levels

To correct for some of the systematic errors in each theoretical model and to bridge the gap between theory and experiment, we have introduced an empirical calibration of the computational results. Such a calibration is a pragmatic way to approximately correct for differences in experimental and theoretical property definitions, as well as in vacuo versus bulk, and oligomer versus polymer results. We have compiled a training set of experimentally well-characterized organic electronic materials for the calibration. The current calibration is largely based on data from bulk-heterojunction solar cells with PCBM as the acceptor material. This introduces a corresponding bias.

In order to obtain more robust values and reduce random errors introduced by potential failures of any individual model chemistry or calibration for particular data points, we average over all the independently acquired and calibrated DFT results. We also average over the different geometries (i.e., essentially over different conformers) available for each molecular motif. These overall averages can thus include up to 75 values. The calibrated and averaged results are our best estimates for the data of interest.
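The calibrate-then-average step can be illustrated with a minimal sketch. It assumes a linear calibration per model chemistry, which the text above does not specify; the model names and coefficients in `CALIBRATION` are hypothetical placeholders, not the values used for CEPDB.

```python
from statistics import mean

# HYPOTHETICAL calibration coefficients (slope, intercept in eV) per model
# chemistry. A linear form is an assumption made for this illustration only.
CALIBRATION = {
    "model_a": (0.90, -0.50),
    "model_b": (1.05, 0.20),
}


def calibrate(model, raw_ev):
    """Map a raw Kohn-Sham eigenvalue (eV) to a calibrated estimate (eV)."""
    slope, intercept = CALIBRATION[model]
    return slope * raw_ev + intercept


def best_estimate(results):
    """Average all calibrated values for one molecular motif.

    `results` is an iterable of (model, conformer_id, raw_eigenvalue_ev)
    tuples, one per combination of model chemistry and geometry.
    """
    calibrated = [calibrate(model, raw) for model, _conformer, raw in results]
    if not calibrated:
        raise ValueError("no results to average")
    return mean(calibrated)


# Two model chemistries and two conformers give four values to average.
homo_results = [
    ("model_a", 0, -5.00),
    ("model_a", 1, -5.10),
    ("model_b", 0, -5.00),
    ("model_b", 1, -5.05),
]
print(best_estimate(homo_results))
```

The average here is a flat mean over every calibrated value, in line with the statement that an overall average can include up to 75 values.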

Scharber Model for OPV Donor Material Performance Prediction

The Scharber model is a specialized version of the Shockley-Queisser model for OPVs, developed against bulk-heterojunction polymer-fullerene solar cells. The only inputs it requires are the HOMO and LUMO energies. In the presented analysis we utilize our best estimates for these values, as well as the standard parameters, i.e., a fill factor of 65%, a uniform external quantum efficiency of 65%, a required LUMO offset of 0.3 eV between donor and acceptor for charge transfer, an empirical loss parameter of 0.3 eV, and a PCBM acceptor counterpart with a LUMO energy level of −4.3 eV.
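As an illustration of how these parameters combine, the following sketch computes Voc, Jsc, and PCE from a donor's HOMO and LUMO energies. It is not the CEPDB implementation. The text does not state which solar spectrum or incident power is used, so the sketch assumes an incident power of 1000 W/m² and substitutes a scaled 5778 K blackbody for the solar spectrum; the function names are likewise illustrative. A tabulated reference spectrum can be passed in instead of the blackbody stand-in.

```python
import math

# Physical constants (SI units)
H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K_B = 1.380649e-23    # Boltzmann constant, J/K
Q = 1.602176634e-19   # elementary charge, C

# Standard Scharber parameters quoted in the text
FILL_FACTOR = 0.65
EQE = 0.65
LUMO_OFFSET_EV = 0.3
LOSS_EV = 0.3
ACCEPTOR_LUMO_EV = -4.3
P_IN = 1000.0         # W/m^2; ASSUMED incident power, not stated in the text


def trapezoid(xs, ys):
    """Trapezoidal integral of ys over xs."""
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


def blackbody_spectrum(t_kelvin=5778.0, lo_nm=280, hi_nm=4000):
    """ASSUMPTION: stand-in solar spectrum on a 1 nm grid, scaled so that it
    integrates to P_IN. Returns (wavelengths_nm, irradiance_W_per_m2_per_nm)."""
    nms = list(range(lo_nm, hi_nm + 1))
    raw = []
    for nm in nms:
        lam = nm * 1e-9
        raw.append(1.0 / (lam ** 5 * math.expm1(H * C / (lam * K_B * t_kelvin))))
    scale = P_IN / trapezoid(nms, raw)
    return nms, [r * scale for r in raw]


def short_circuit_current(gap_ev, nms, irradiance):
    """Jsc in A/m^2: EQE times the flux of photons with energy above the gap.
    Wavelengths must be in ascending order."""
    cutoff_nm = 1e9 * H * C / (gap_ev * Q)
    xs = [nm for nm in nms if nm <= cutoff_nm]
    # photons per second per m^2 per nm = irradiance / photon energy
    flux = [irr * nm * 1e-9 / (H * C) for nm, irr in zip(xs, irradiance)]
    return Q * EQE * trapezoid(xs, flux)


def scharber(homo_ev, lumo_ev, nms, irradiance):
    """Scharber-type estimate for a donor with the given HOMO/LUMO (eV)."""
    gap_ev = lumo_ev - homo_ev
    if gap_ev <= 0:
        raise ValueError("LUMO must lie above HOMO")
    voc = max(0.0, abs(homo_ev) - abs(ACCEPTOR_LUMO_EV) - LOSS_EV)  # volts
    jsc = short_circuit_current(gap_ev, nms, irradiance)            # A/m^2
    pce = 100.0 * voc * jsc * FILL_FACTOR / P_IN                    # percent
    offset_ok = lumo_ev - ACCEPTOR_LUMO_EV >= LUMO_OFFSET_EV
    return {"gap_ev": gap_ev, "voc_v": voc, "jsc_a_per_m2": jsc,
            "pce_percent": pce, "lumo_offset_ok": offset_ok}


nms, irr = blackbody_spectrum()
print(scharber(homo_ev=-5.4, lumo_ev=-3.6, nms=nms, irradiance=irr))
```

Because the spectrum is a blackbody approximation, the Jsc and PCE values from this sketch will differ from those reported in the database; only the structure of the calculation is meant to carry over.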


We emphasize that the predictions from the Scharber analysis are subject to the limitations of this relatively simple model, the various assumptions that go into it, and the approximations that are implicit in the input data provided by the approach described above. The resulting PCE values should be interpreted as the performance potential that a compound may achieve if the assumptions used in the Scharber model can be met. These assumptions implicitly incorporate a number of additional requirements (in particular related to the complicated bulk and interphase behavior as well as to the exciton and charge carrier dynamics) that have to be met in order to obtain a high-performance material. The standard parameters that reflect these assumptions can in principle be further improved, but their practical realization already poses challenges.

While the Scharber model is clearly too simplistic to account for all the complex physics of an OPV explicitly, it nonetheless provides a valuable indication of the inherent promise of a candidate compound. A good PCE value is thus a necessary condition for a successful donor material in such a solar cell, but not a sufficient one. It offers a guideline as to whether development efforts geared towards realizing the other material features have a chance of being worthwhile. However, there is no guarantee that our top candidates will indeed perform as well as indicated, since they may fail because of factors not captured in the employed analysis.

Pharmaceutical screening efforts are a good analogy to the CEP efforts: our calculations reveal insights into new and potentially successful molecular patterns, which can be further explored by experiment and more detailed calculations. We are also actively working on more sophisticated ranking models based on machine learning approaches.


The data in this database are released under the Creative Commons Attribution-ShareAlike license. If you use the data, please cite the academic publications associated with CEPDB appropriately.


The technical description above was adapted from our upcoming paper on “Lead Candidates for High-Performance Organic Photovoltaics from High-Throughput Quantum Chemistry – the Harvard Clean Energy Project” as well as from our 2011 paper on “The Harvard Clean Energy Project: Large-Scale Computational Screening and Design of Organic Photovoltaics on the World Community Grid” in The Journal of Physical Chemistry Letters. Please also consider all the work by others cited therein.

© 2013 Molecular Space
Design: S. Valleau and C. Román