Return to Product

Research lab

Apromore lives and breathes research. Our software platform is directly driven by our own research and informed by the latest developments in the process mining academic community, and more broadly in the business process management community.

This page lists standalone distributions of those experimental tools that are being developed by the Business Process Management Team at The University of Melbourne, in collaboration with the Software Engineering & Information Systems Group at the University of Tartu. Many of these tools are later refined and improved to become experimental plugins on top of Apromore Community Edition, and may one day be productized into commercial-strength plugins for Apromore Enterprise Edition.

The experimental tools listed below may not be in their latest version. This happens when a tool is turned into a plugin for Apromore, after which we usually maintain the latter but not necessarily the standalone distribution. Please contact the tool authors before using a given tool for experimentation, as they may point you to a more recent version or alert you to known bugs.

  • Automata-based Behavioral Precision (ABP)

    (by D. Reissner, A. Armas-Cervantes, R. Conforti, M. Dumas, M. La Rosa, A. Augusto)
    Given a BPMN model and an event log in XES or MXML format, this tool returns the precision of the model w.r.t. the log. Precision measures the extent of extra model behavior that is not recorded in the log. The measure ranges from 0 (highly imprecise model) to 1 (highly precise model). Specifically, this measure is computed by relying on a lossless representation of the log and of the model behavior based on automata, which are then compared by simulating their behavior using a product automaton.

    A companion metric based on Markovian abstraction is available here.

  • Automatable Routines Discoverer

    (by A. Bosco, A. Augusto, M. Dumas, M. La Rosa, and G. Fortino)
    This tool allows one to analyze user interaction (UI) logs in order to discover sequences of actions (i.e. routines) that are fully deterministic and can thus be automated, for example with robotic process automation (RPA) tools. The tool losslessly compresses the user interaction log into a Deterministic Acyclic Finite State Automaton (DAFSA). It then applies an algorithm to decompose biconnected graphs (of which a DAFSA is an example) into Single-Entry Single-Exit (SESE) regions. Some of these SESE regions correspond to sequences of actions. For each such sequence, the tool checks whether every action is deterministic. If so, the method tries to discover an activation condition for the sequence of deterministic actions using a rule mining technique. If, instead, an action in the middle of the sequence is not deterministic, the sequence is split into subsequences (called subroutines) for which the tool tries to discover activation conditions separately. For each (sub)sequence for which a rule is found, an activation condition is defined and a routine specification is generated. The tool outputs the list of routine specifications.
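    A minimal sketch of the per-action determinism test described above, in Python. The one-successor criterion and the example log are illustrative simplifications; the actual tool reasons over SESE regions of the DAFSA and uses rule mining for activation conditions:

```python
from collections import defaultdict

def deterministic_actions(ui_log):
    """An action is 'deterministic' here if it is always followed by the
    same next action across all recorded traces (a simplified criterion)."""
    followers = defaultdict(set)
    for trace in ui_log:
        for a, b in zip(trace, trace[1:]):
            followers[a].add(b)
    return {a for a, succ in followers.items() if len(succ) == 1}

ui_log = [
    ["open_form", "copy_name", "paste_name", "submit"],
    ["open_form", "copy_name", "paste_name", "review", "submit"],
]
# "paste_name" is followed by "submit" in one trace and "review" in the
# other, so it fails this determinism criterion.
```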

  • BPMN Miner 2.0

    (by R. Conforti, A. Augusto, M. Dumas, L. Garcia-Banuelos and M. La Rosa)
    BPMN Miner is a tool for the automated discovery of maximally-structured, hierarchical BPMN models containing subprocesses, interrupting and non-interrupting boundary events and activity markers. The tool works on top of a range of flat process discovery algorithms: Heuristics Miner, Inductive Miner, Fodina, ILP Miner and the Alpha algorithm. It employs functional and inclusion dependency discovery techniques in order to elicit a process-subprocess hierarchy from the event log. It requires as input a log in the XES or MXML format, and produces a standard BPMN 2.0 model (.bpmn) as output. The tool will identify inclusion dependencies from the log, and ask the user to validate these dependencies before proceeding with the mining of the BPMN model. The identification of the inclusion dependencies is noise-tolerant. Moreover, the tool integrates Structured Miner, meaning that it returns a maximally structured BPMN 2.0 model by combining BPStruct and Extended Oulsnam Structurer (both used with default settings). BPMN Miner has been integrated as a plugin into Apromore.
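    The noise-tolerant inclusion-dependency check underlying the subprocess extraction can be sketched as follows. This is a hypothetical simplification: the `noise_tolerance` parameter and the plain set-containment test are assumptions, not the tool's exact procedure:

```python
def inclusion_dependency(child_values, parent_values, noise_tolerance=0.0):
    """Check whether the child column's values are contained in the parent
    column's values, allowing a fraction of violations (crude stand-in
    for BPMN Miner's noise-tolerant inclusion dependency detection)."""
    child = list(child_values)
    if not child:
        return True
    parent = set(parent_values)
    violations = sum(1 for v in child if v not in parent)
    return violations / len(child) <= noise_tolerance

# A clean containment holds; one dirty value is tolerated only if the
# noise tolerance allows it.
```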

  • BProVe

    (by F. Corradini, F. Fornari, A. Polini, B. Re, F. Tiezzi, A. Vandin, M. La Rosa)
    BProVe is a tool supporting the automated verification of BPMN collaboration models. The analysis is based on a formal operational semantics defined for the BPMN 2.0 modelling language, and is provided as a freely accessible service that uses open standard formats as input data. In particular, BProVe makes it possible to analyse the correctness of models with respect to domain-independent properties, such as soundness and safeness, as well as domain-dependent properties, e.g. checking the correct exchange of messages or the proper evolution of process activities. BProVe provides diagnostic information that can be easily reported on the diagram in a way that is understandable by process stakeholders. This is especially useful when different parties, with different backgrounds, need to interact quickly on the basis of a model. From a technical point of view, BProVe is based on a running instance of Maude loaded with the Maude modules implementing the BPMN operational semantics and the Maude LTL model checker. BProVe has also been integrated as a plugin into Apromore. This way it is possible to check BPMN model correctness from within the Apromore Editor.

  • Business Process Clone Detector

    (by R. Uba, M. La Rosa, L. Garcia-Banuelos and M. Dumas)
    Business Process Clone Detector is a command-line tool for detecting duplicate fragments (a.k.a. clones) in repositories of process models. The tool takes a collection of EPC models as input (at least two models) and returns a DOT image for each identified clone. These images can be opened with ProM 5.2. It is possible to choose the minimum size of a clone, which is 4 nodes by default.

    Source code (provided “as is”, under LGPL v3.0)

  • Infrequent Process Behavior Filter

    (by R. Conforti, M. La Rosa and A.H.M. ter Hofstede)
    The analysis of business process event logs can be negatively influenced by the presence of outliers, which reflect infrequent behavior or “noise”. In process discovery, where the objective is to automatically extract a process model from an event log, this may result in rarely travelled pathways that clutter the process model. The Infrequent Process Behavior Filter automatically filters out infrequent behavior while minimizing the number of events being removed from the log. The tool accepts as input an event log in XES or MXML format and provides a filtered log in output. This tool has been integrated into Apromore as part of the BPMN Miner plugin, where users can choose whether to filter the input log before process discovery.

  • Markovian Fitness and Precision (MFP)

    (by A. Augusto, A. Armas-Cervantes, R. Conforti, M. Dumas, M. La Rosa, D. Reissner)
    Given a BPMN model and an event log in XES or MXML format, this tool computes the fitness and the precision of the model w.r.t. the log. Fitness measures how much of the behavior recorded in the log can be replayed by the process model. The measure ranges from 0 (highly unfitting model) to 1 (fully fitting model). Precision measures the extent of extra model behavior that is not recorded in the log. The measure ranges from 0 (highly imprecise model) to 1 (highly precise model). Specifically, these measures are computed by relying on the Markovian representations of the log and of the model behavior, which are then compared using graph-edit distance. The Markovian representation may be lossy depending on the k-order parameter.

    A companion metric based on automata abstraction is available here.
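    To illustrate the idea of comparing order-k behavioral abstractions of a log and a model, here is a hypothetical Python sketch based on k-gram (length-k subtrace) sets. The actual MFP measures compare Markovian abstractions via graph-edit distance, not simple set intersection, so this is an illustration of the concept only:

```python
def kgrams(traces, k):
    """Collect all length-k subtraces, with artificial start ('-') and
    end ('+') markers so trace boundaries are represented too."""
    grams = set()
    for t in traces:
        padded = ["-"] + list(t) + ["+"]
        for i in range(len(padded) - k + 1):
            grams.add(tuple(padded[i:i + k]))
    return grams

def kgram_fitness_precision(log_traces, model_traces, k=2):
    L, M = kgrams(log_traces, k), kgrams(model_traces, k)
    fitness = len(L & M) / len(L)    # share of log behavior covered by the model
    precision = len(L & M) / len(M)  # share of model behavior seen in the log
    return fitness, precision

# A model that replays the log but allows an extra path is fully fitting
# yet not fully precise.
log_traces = [["a", "b", "c"]]
model_traces = [["a", "b", "c"], ["a", "c"]]
```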

  • Multi-Perspective Process Comparator (MPC)

    (by H. Nguyen, M. Dumas, M. La Rosa, A.H.M. ter Hofstede)
    Existing approaches to log-based process variant comparison are restricted to intra-case relations, and more specifically, directly-follows relations such as “a task directly follows another one” or “a resource directly hands-off to another resource” within the same case. This tool implements a more general approach based on so-called perspective graphs. A perspective graph is a graph-based abstraction of an event log where a node represents any entity in an event log (task, resource, location, etc.) and an arc represents an arbitrary relation between these entities (e.g. directly-follows, co-occurs, hands-off to, works-together with, etc.) within or across cases. Statistically significant differences between two perspective graphs are captured in a so-called differential perspective graph, which allows us to compare two event logs from any given perspective. The tool is packaged as a standalone ProM distribution containing the plugin called “Multi-Perspective Process Comparator”. The inputs to the plugin are two event logs and user-defined parameters for comparison. The output is a matrix-based visualization of differences between the two logs. Detailed usage instructions are documented in the tool manual.
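    The perspective-graph idea can be sketched as follows, using directly-follows as the example relation. Note the real MPC additionally tests differences for statistical significance; this sketch just reports raw frequency differences:

```python
from collections import Counter

def perspective_graph(log, relation=lambda t: zip(t, t[1:])):
    """Arc-frequency abstraction of a log; `relation` defaults to
    directly-follows but could extract hands-offs, co-occurrence, etc."""
    arcs = Counter()
    for trace in log:
        arcs.update(relation(trace))
    return arcs

def differential_graph(g1, g2):
    """Arcs whose frequency differs between the two perspective graphs
    (the real tool keeps only statistically significant differences)."""
    return {arc: g1[arc] - g2[arc]
            for arc in set(g1) | set(g2) if g1[arc] != g2[arc]}
```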

  • Nirdizati

    (by A. Rozumnyi, I. Verenich, M. La Rosa, M. Dumas, F. Maggi, I. Teinemaa)
    Nirdizati is a dashboard-based monitoring tool which is updated periodically based on incoming streams of events. However, unlike classical monitoring dashboards, Nirdizati does not focus on showing the current state of business process executions, but their future state (e.g. when will each case finish). On the backend, Nirdizati uses predictive models trained using machine learning methods, including deep learning. Currently, Nirdizati processes two predefined event streams corresponding to the Business Process Intelligence Challenges (BPIC 2012 and BPIC 2017). Both logs originate from a financial institution and pertain to a loan application process. For the 2012 BPIC, we are using a classification model to predict whether the case duration will be within a certain threshold and a regression model to predict the remaining cycle time of an ongoing case. In addition, for the 2017 BPIC, we predict whether a customer will accept a loan offer via a classification model. All the predictions are updated automatically as new events arrive.

  • Optimization Framework for Automated Process Discovery

    (by A. Augusto, M. Dumas, M. La Rosa, S.J.J. Leemans, S.K.L.M. vanden Broucke)
    This tool implements four optimization metaheuristics (iterative local search, repetitive local search, tabu search, and simulated annealing) to optimize the automated discovery of process models. Given an event log as input (in MXML, XES.GZ or XES format), the tool can optimize one of the three automated process discovery algorithms implemented within the framework: Split Miner, Fodina, or Inductive Miner. The optimization is driven by the selected metaheuristic, which explores the solution space in a pseudo-random manner, looking for the process model that scores the highest Markovian fitness and precision values. The exploration of the solution space ends when a timeout or a maximum number of exploration iterations is reached. The tool outputs the best process model found during the solution space exploration, in BPMN 2.0, which can be opened and visualized using different tools, such as Apromore.
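    A generic simulated-annealing loop of the kind used by such frameworks can be sketched as follows. The `score` function here is a toy stand-in; the real framework would run a discovery algorithm at each candidate configuration and score the resulting model with the Markovian fitness and precision:

```python
import math
import random

def simulated_annealing(score, init, neighbor, iters=200, t0=1.0, seed=42):
    """Keep a current candidate; accept worse neighbours with a
    probability that shrinks as the temperature cools; track the best."""
    rng = random.Random(seed)
    cur = best = init
    for i in range(1, iters + 1):
        cand = neighbor(cur, rng)
        delta = score(cand) - score(cur)
        temp = t0 / i  # simple cooling schedule
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            cur = cand
        if score(cur) > score(best):
            best = cur
    return best

# Toy stand-in for "quality of the model discovered at threshold x".
score = lambda x: -(x - 0.3) ** 2
neighbor = lambda x, rng: min(1.0, max(0.0, x + rng.uniform(-0.1, 0.1)))
best = simulated_annealing(score, init=0.9, neighbor=neighbor)
```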

  • OptimizeKnockout

    (by I. Verenich, M. Dumas, M. La Rosa, F.M. Maggi and C. Di Francescomarino)
    OptimizeKnockout is a tool for finding an optimal ordering of check activities in a so-called “knockout section” of a business process in order to minimize overprocessing. Overprocessing waste occurs in a business process when effort is spent in a way that adds value neither to the customer nor to the business. A recurrent overprocessing pattern in business processes happens in the context of “knockout checks”, i.e. activities that classify a case into “accepted” or “rejected”, such that if the case is accepted it proceeds forward, while if rejected, it is cancelled and all work performed in the case is considered unnecessary. Thus, when a knockout check rejects a case, the effort spent in other (previous) checks becomes overprocessing waste, according to the Lean classification. This tool implements a fine-grained approach to reorder knockout checks at runtime based on predictive machine learning models.
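    The intuition behind knockout reordering can be illustrated with the classical static rule that orders checks by rejection probability per unit of effort. The tool itself goes further, predicting case-specific rejection probabilities at runtime; the check names, costs and probabilities below are made up:

```python
from itertools import permutations

def expected_effort(order, cost, reject_p):
    """Expected total effort: each check is paid only if the case has
    passed all earlier checks."""
    effort, pass_prob = 0.0, 1.0
    for c in order:
        effort += pass_prob * cost[c]
        pass_prob *= (1 - reject_p[c])
    return effort

def best_order(cost, reject_p):
    # Classical rule: sort by rejection probability / cost, descending,
    # so cheap, likely-to-knock-out checks run first.
    return sorted(cost, key=lambda c: reject_p[c] / cost[c], reverse=True)

cost = {"credit": 5.0, "identity": 1.0, "income": 3.0}
reject_p = {"credit": 0.3, "identity": 0.1, "income": 0.4}
```

    Brute-forcing all orderings confirms the ratio rule is optimal for this example.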

  • PNSA Algorithm

    (by M. Gambini, M. La Rosa, S. Migliorini and A. ter Hofstede)
    PNSA Algorithm is a command-line tool for automatically fixing unsound Workflow nets. The core procedure is a heuristic optimization algorithm inspired by the dominance-based Multi-Objective Simulated Annealing procedure. Given a Workflow net and the output of its soundness check, at each run the algorithm generates a small set of alternative models (“solutions”) similar to the original model but containing fewer or no behavioral errors, until a maximum number of desired solutions is found or a given timeframe elapses. These solutions are produced by applying a number of controlled changes on the current solution, which in turn is derived from the original model. The similarity of a solution to the original model is determined by its structural similarity and (to remain efficient) by an approximation of its behavioral similarity to the original model. Since the intentions of the process modeler are not known and there are usually many ways in which an error can be corrected, the algorithm returns several non-redundant final solutions (i.e. no solution is worse than any of the others). The differences between these solutions and the original model can then be presented to a process modeler as suggestions to rectify the behavioral errors in the original model.

  • Predictive Business Process Monitoring with LSTM

    (by N. Tax, I. Verenich, M. La Rosa and M. Dumas)
    This tool can be used to perform the following prediction tasks: i) prediction of the next type of activity to be executed in a running process instance; ii) prediction of the timestamp of the next type of activity to be executed; iii) prediction of the continuation of a running instance, i.e. its suffix; and iv) prediction of the remaining cycle time of an instance. The tool trains a Long Short-Term Memory (LSTM)-based predictive model using data about historical process instances. Next, the models are evaluated on running, i.e. incomplete, instances. It assumes the input is a complete log of all traces in CSV format, wherein the first column is a case ID, followed by the activity name or ID, and finally the activity timestamp. The input log is then temporally split into a training set (66%) and a test set (34%); on the test set, the tool evaluates prediction performance for every size of a partial trace (e.g. a test trace cut at the 2nd event, the same trace cut at the 3rd event, and so on) along all four prediction tasks.
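    The evaluation protocol described above (temporal split plus prefix extraction) can be sketched as follows. The `(case, activity, timestamp)` tuple layout mirrors the expected CSV columns but is otherwise an assumption:

```python
def temporal_split(traces, train_frac=0.66):
    """Order cases by the timestamp of their first event and cut the
    list: earlier cases form the training set, later ones the test set."""
    ordered = sorted(traces, key=lambda t: t[0][2])  # row = (case, activity, ts)
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]

def prefixes(trace, min_len=2):
    """Partial traces for evaluation: the trace cut at its 2nd event,
    3rd event, and so on (the complete trace itself is excluded)."""
    return [trace[:k] for k in range(min_len, len(trace))]
```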

  • Process Merger

    (by M. La Rosa, M. Dumas, R. Uba and R. Dijkman)
    Process Merger is a command-line tool for merging (C-)EPC process models into a C-EPC process model. This tool accepts two or more models in the EPML format (.epml) and merges them by creating a configurable process model in C-EPC. Nodes that belong to all input models are only taken once, and reconnected to all other nodes that are not in common by means of configurable XOR connectors. It is possible to select the matching algorithm (Greedy or Hungarian) used to determine the mapping between the nodes of the two input models, and to customize the matching thresholds for functions/events and for connectors. To compute the similarity between each pair of input models, the Process Merger tool embeds the Process Similarity tool, which is also available for download separately. Moreover, this tool can compute the digest of a merged model. The digest is a projection of a configurable model where only the nodes that satisfy a given occurrence frequency appear, e.g. all nodes that occur in at least three of the five input models, or all nodes that are in common to all input models. Placeholder nodes may be added to avoid disconnections in the resulting digest. Process Merger has been integrated as a plugin into Apromore.

  • Process Similarity

    (by M. La Rosa, M. Dumas, R. Uba and R. Dijkman)
    Process Similarity is a command-line tool which computes the similarity between two (C-)EPC models based on graph-matching techniques. It is possible to choose the matching algorithm (Hungarian or Greedy) and configure the thresholds for model similarity, label similarity and connector similarity, and the weights for skipped nodes/edges and matched nodes/edges. The result is a value between 0 and 1 indicating the degree of similarity between the two input models. Process Similarity has been integrated as a plugin into Apromore.
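    A greedy label-matching similarity of the kind mentioned above can be sketched as follows, using `difflib` string similarity as a stand-in for the tool's label-similarity measure and ignoring edges and connectors:

```python
from difflib import SequenceMatcher

def label_sim(a, b):
    """String similarity in [0, 1] as a proxy for label similarity."""
    return SequenceMatcher(None, a, b).ratio()

def greedy_similarity(nodes1, nodes2, threshold=0.5):
    """Greedily pair the most similar node labels above a threshold and
    score the two models by the matched mass (a crude node-only proxy
    for graph-matching similarity)."""
    pairs = sorted(((label_sim(a, b), a, b) for a in nodes1 for b in nodes2),
                   reverse=True)
    used1, used2, matched = set(), set(), 0.0
    for s, a, b in pairs:
        if s < threshold:
            break
        if a not in used1 and b not in used2:
            used1.add(a)
            used2.add(b)
            matched += s
    return 2 * matched / (len(nodes1) + len(nodes2))
```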

  • ProConformance 1.0 (event structures-based)

    (by L. Garcia-Banuelos, N. van Beest, M. Dumas and M. La Rosa)
    Given a process model and a process execution log, ProConformance 1.0 provides a list of statements in natural language capturing behavior that is present or frequent in the model, while absent or infrequent in the log, and vice versa. This conformance analysis method allows users to diagnose differences between prescriptive process behavior (as captured in the process model) and deviant executions of a process (as captured in the log), e.g. for compliance purposes, or between two versions or variants of a process. The model can be provided in BPMN and the log in the MXML or XES format.

    ProConformance has also been integrated into Apromore as part of the Compare plugin. This latter plugin is the most up-to-date and currently maintained version of ProConformance.

    Datasets used in the experiments of the paper “Complete and Interpretable Conformance Checking of Business Processes“.

  • ProConformance 2.0 (automata-based)

    (by D. Reissner, R. Conforti, M. Dumas, M. La Rosa and A. Armas-Cervantes)
    ProConformance 2.0 provides a list of statements in natural language capturing behavior that is present or frequent in the log but not in the model. Moreover, it returns all-optimal and one-optimal trace alignments between the log and the model. The difference with ProConformance 1.0 is that the internal structures are based on automata (a Deterministic Acyclic Finite State Automaton – DAFSA – is built from the log and a reachability graph is built from the process model) instead of event structures. Similar to ProConformance 1.0, the model can be provided in BPMN and the log in the MXML or XES format.

  • ProConformance 3.0 (automata-based)

    (by D. Reissner, A. Armas-Cervantes, R. Conforti, M. Dumas, D. Fahland, M. La Rosa)
    ProConformance 3.0 features a re-engineered ProConformance tool with a number of optimizations. In addition, the package implements several extensions to improve scalability with large datasets. Specifically, there are four options: i) base approach without any extension (Automata); ii) with the S-Components extension (SComp) to tackle concurrent process models; iii) with the S-Components and tandem repeats reduction (TR-SComp) to tackle event logs with lots of repetitions; or iv) a hybrid approach that tries to automatically select the most suitable extension based on the characteristics of the input model and log (Hybrid).

    Datasets used in the experiments of the papers “Scalable Alignment of Process Models and Event Logs: An Approach Based on Automata and S-Components” and “Efficient Conformance Checking using Alignment Computation with Tandem Repeats“.
    Source code (provided “as is”, under Apache v2.0)

  • ProDelta

    (by N. van Beest, M. Dumas, L. Garcia-Banuelos and M. La Rosa)
    Given two process execution logs, ProDelta provides a list of statements in natural language capturing behavior that is present or frequent in one log, while absent or infrequent in the other. This log delta analysis method allows users to diagnose differences between normal and deviant executions of a process or between two versions or variants of a process. The logs can be provided in the MXML or XES format. ProDelta has been integrated into Apromore as part of the Compare plugin.

  • ProDrift 4.5

    (by A. Ostovar, A. Maaradji, M. Dumas, M. La Rosa)
    ProDrift is a fully-automated tool for detecting and characterizing business process drifts. The tool accepts as input a process execution log in MXML or XES format, and performs statistical tests over a stream of runs or a stream of events, obtained by replaying the event log. ProDrift accepts an optional window size (specified as number of traces or events), as well as the possibility of using an adaptive window. If the latter option is chosen, ProDrift will adapt the window size in order to strike a trade-off between classification accuracy and drift detection delay. The output is a list of drifts, each with information on the location in the stream of traces (or events) where the drift occurred, and a list of behavioral relations that have been modified by the drift. Drifts can also be characterized at the level of entire process fragments (i.e. single-entry-single-exit sub-processes containing multiple activities and gateways), in which case the tool will return one or more statements in natural language, describing the fragments being affected by the drift and how they have been changed. ProDrift has also been integrated as a plugin into Apromore.

    Synthetic logs used in “Robust Drift Characterization from Event Streams of Business Processes”
    Synthetic logs used in “Fast and Accurate Business Process Drift Detection”
    Synthetic logs used in “Detecting Drift from Event Streams of Unpredictable Business Processes”
    Source code (provided “as is”, under LGPL v3.0)
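    The window-based comparison underlying drift detection can be illustrated as follows. Total-variation distance over directly-follows frequencies is a simplified stand-in for ProDrift's proper statistical tests over streams of runs or events:

```python
from collections import Counter

def relation_freqs(window):
    """Relative frequencies of directly-follows relations in a window
    of traces."""
    c = Counter()
    for trace in window:
        c.update(zip(trace, trace[1:]))
    total = sum(c.values())
    return {r: n / total for r, n in c.items()}

def drift_score(window1, window2):
    """Total-variation distance between the directly-follows
    distributions of two adjacent windows; 0 means identical behavior,
    values near 1 suggest a drift (simplified stand-in for a test)."""
    p, q = relation_freqs(window1), relation_freqs(window2)
    return 0.5 * sum(abs(p.get(r, 0) - q.get(r, 0)) for r in set(p) | set(q))
```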

  • ProLoCon

    (by A. Armas Cervantes, M. Dumas and M. La Rosa)
    ProLoCon is a command-line tool for the computation of local concurrency oracles out of event logs. Given an event log, the tool constructs a state space representing the behavior captured in the log and identifies parts of this state space, referred to as scopes, where concurrency relations between pairs of events hold. The state space abstracts the behavior in the log as an acyclic transition graph, where every vertex in the graph denotes an execution state and every transition denotes an event occurrence. Then, a scope is a pair of vertices (execution states) where pairs of events can occur concurrently. The current version of the tool uses the Alpha algorithm for the computation of the concurrency relations between events. The input required for the tool is simply an event log in either XES or MXML format.
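    The Alpha-algorithm concurrency relation mentioned above can be sketched as follows. Note this computes the relation globally over the whole log, whereas ProLoCon applies it locally within scopes of the state space:

```python
def alpha_concurrency(traces):
    """Alpha-algorithm relation: a || b iff a is directly followed by b
    somewhere in the log AND b is directly followed by a."""
    df = {(a, b) for t in traces for a, b in zip(t, t[1:])}
    return {(a, b) for (a, b) in df if (b, a) in df and a != b}

# In these two traces, b and c swap order, so they are deemed concurrent.
log = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
```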

  • ProSeqPredict

    (by I. Verenich, D. Chasovskyi, M. Dumas, M. La Rosa, F. Maggi and A. Rozumnyi)
    ProSeqPredict is a tool to predict the most likely sequence of activities (trace suffix) that will be executed from a partial process instance (trace prefix), based on the information already available on the prefix as well as on the availability of past traces already executed, which are recorded in an event log. It requires as input an event log in CSV format and the length of the prefix to be used. The tool will predict the most likely suffix for each prefix of that length present in the log.
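    A frequency-based baseline conveys the task of suffix prediction. This is not ProSeqPredict's actual model, just an illustration of predicting the most likely continuation of a prefix from historical traces:

```python
from collections import Counter

def most_likely_suffix(prefix, history):
    """Return the most frequent suffix observed after this exact prefix
    in the historical traces (empty if the prefix was never observed)."""
    n = len(prefix)
    suffixes = Counter(tuple(t[n:]) for t in history if t[:n] == prefix)
    return list(suffixes.most_common(1)[0][0]) if suffixes else []
```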

  • ProVariant

    (by N. van Beest, H. Groefsema, L. Garcia-Banuelos and M. Aiello)
    ProVariant allows the automated generation of declarative specifications from a set of business process variants. It takes as input a set of process models in PNML format and returns as output a set of CTL specifications stored in an XML file.

  • Slice Mine Dice (SMD) Process Miner

    (by C.C. Ekanayake, M. Dumas, L. Garcia-Banuelos and M. La Rosa)
    SMD is a tool for mining a collection of process models from a process log. This tool uses a combination of trace clustering and clone detection techniques to mine a process model collection where similar process sections are extracted as subprocesses. The tool requires as input a log, an existing trace clustering technique (different ones can be chosen) and a complexity threshold. The result is a hierarchical process model collection where the size of each process model is bounded by the threshold. As this tool can detect and extract common sections from discovered process models, the resulting process model collection has a smaller overall size and fewer process models compared to a collection obtained with a trace clustering technique under the same complexity bound. Furthermore, identification and extraction of similar sections could facilitate better analysis of the generated process model collection.

  • Split Miner

    (by A. Augusto, R. Conforti, M. Dumas and M. La Rosa)
    Split Miner is a tool for fast mining of simple, accurate and deadlock-free BPMN process models from an event log. The approach works in five steps. The first step discovers the directly-follows graph and identifies loops in the process behavior captured in the input event log. The second step detects parallelism between process activities. The third step filters the graph by removing infrequent behavior. The fourth step detects the split gateways while the last step discovers the join gateways. The event log can be in MXML, XES.GZ or XES format. The output model, in BPMN 2.0, can be opened and visualized using different tools, such as Apromore.
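    The first and third steps (directly-follows graph construction and frequency filtering) can be sketched as follows. The single relative cut-off used here is a simplification of Split Miner's percentile-based filter:

```python
from collections import Counter

def directly_follows_graph(log, keep_fraction=0.5):
    """Build the directly-follows graph of a log, then drop arcs whose
    frequency is below keep_fraction of the most frequent arc
    (simplified stand-in for Split Miner's filtering step)."""
    arcs = Counter()
    for trace in log:
        arcs.update(zip(trace, trace[1:]))
    cutoff = keep_fraction * max(arcs.values())
    return {arc: n for arc, n in arcs.items() if n >= cutoff}

# Nine frequent traces plus one rare shortcut: the shortcut arc is filtered out.
log = [["a", "b", "c"]] * 9 + [["a", "c"]]
```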

  • Split Miner 2.0

    (by A. Augusto, M. Dumas and M. La Rosa)
    Split Miner 2.0 is an extended version of the Split Miner algorithm for discovering accurate and deadlock-free BPMN process models from event logs. With respect to the original (2017) version of Split Miner, the main improvements of Split Miner 2.0 are:

    • Ability to use both the start timestamp and the end timestamp of each activity, which allows it to identify concurrency more accurately when both timestamps are present in the log. The original Split Miner algorithm relied only on the end timestamps.

    • Ability to discover BPMN process models with inclusive decision gateways (OR-splits), which leads to process models with simpler branching structures. 

  • Staged Process Flow Performance Analyzer

    (by H. Nguyen, A.H.M. ter Hofstede, M. Dumas, M. La Rosa, F.M. Maggi)
    Existing process mining techniques provide summary views of the overall process performance over a period of time, allowing analysts to identify bottlenecks and associated performance issues. However, these tools are not designed to help analysts understand how bottlenecks form and dissolve over time nor how the formation and dissolution of bottlenecks – and associated fluctuations in demand and capacity – affect the overall process performance. Staged Process Flow (SPF) is a ProM plugin offering a number of visualizations that collectively allow process performance evolution to be analyzed from multiple perspectives. The idea underlying this tool is an abstraction of a business process as a series of queues corresponding to stages. SPF has also been integrated as a plugin into Apromore.

  • Staged Process Miner

    (by H. Nguyen, A.H.M. ter Hofstede, M. Dumas, M. La Rosa, F.M. Maggi)
    This is a standalone ProM distribution containing the Staged Process Miner plugin and the Stage-based Process Discovery plugin, as well as plugins for relevant baseline techniques. The Staged Process Miner plugin takes as input an event log in XES or MXML format and returns a partitioning of this log into stages (called a “stage model”). The only parameter required is the minimum number of events for each stage. The Stage-based Process Discovery plugin takes as input the stage model and an event log and returns a process model (Petri net and BPMN). The two baseline techniques for stage mining included in the package are the Divide and Conquer framework (DC) and the Performance Analysis with Simple Precedence Diagram (SPD). The four baseline techniques for stage-based process discovery included in the tool are Decomposed Miner, Region-based Miner (genet tool), Inductive Miner and Fodina. This ProM distribution also comes with a plugin to visualize the output of the above three techniques (SPM, SPD and DC). You can also download the SPM plugin directly from the ProM nightly build. The Staged Process Miner plugin has also been integrated as a plugin into Apromore.

  • Structured Miner 1.1

    (by A. Augusto, R. Conforti, M. Dumas, M. La Rosa, G. Bruno)
    Structured Miner is a tool for mining maximally structured process models in BPMN from an event log. The approach works in two phases. The first phase discovers the BPMN process model from an input log using a baseline discovery algorithm which does not force the discovered model to be structured (currently, Heuristics Miner and Fodina Miner are supported). The second phase structures the discovered model by combining BPStruct and Extended Oulsnam Structurer (both used with default settings). The event log can be in MXML or XES format. The discovered model, in BPMN 2.0, can be opened and visualized using different tools, e.g. Apromore. Structured Miner is also part of BPMN Miner 2.0. The difference between the two is that Structured Miner always discovers flat process models whereas BPMN Miner 2.0 discovers hierarchical process models with subprocesses. Structured Miner has also been integrated into Apromore as part of the BPMN Miner plugin.

  • Timestamp Repair for Event Logs

    (by R. Conforti, M. La Rosa, A.H.M. ter Hofstede, A. Augusto)
    This tool allows the automatic correction of timestamp errors in business process execution logs. Given an input event log, it detects events recorded with the same timestamp; it first repairs the order of these events by relying on correct event log traces (where the same events do not have recording errors), and then computes the likely true timestamp for each event affected by same-timestamp errors. The tool takes as input the affected event log in the XES format and outputs the repaired event log.
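    The first repair step (reordering same-timestamp events based on correct traces) can be sketched as follows. The majority-vote heuristic is an illustrative assumption, not the tool's exact procedure:

```python
from collections import Counter

def repair_tie(tied_events, reference_traces):
    """Order events that share a timestamp according to the relative
    order most often observed in traces where the same events have
    distinct timestamps."""
    votes = Counter()
    for t in reference_traces:
        pos = {e: i for i, e in enumerate(t)}
        for a in tied_events:
            for b in tied_events:
                if a != b and a in pos and b in pos and pos[a] < pos[b]:
                    votes[(a, b)] += 1
    # Events that usually precede the other tied events come first.
    return sorted(tied_events,
                  key=lambda e: sum(votes[(e, x)] for x in tied_events),
                  reverse=True)
```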