Plotting Ciw's Café Example As A Sankey Diagram Using Plotly

In this post we’ll run the café example from the Ciw documentation, collect the results, and display them as a Sankey diagram.

First, we can run the example:

import ciw
import pandas as pd

ciw.seed(2018)

N = ciw.create_network(
    arrival_distributions=[
        ciw.dists.Exponential(rate=0.3),
        ciw.dists.Exponential(rate=0.2),
        None,
    ],
    service_distributions=[
        ciw.dists.Exponential(rate=1.0),
        ciw.dists.Exponential(rate=0.4),
        ciw.dists.Exponential(rate=0.5),
    ],
    routing=[[0.0, 0.3, 0.7], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
    number_of_servers=[1, 2, 2],
)

Q = ciw.Simulation(N)

Q.simulate_until_max_time(200)

recs = pd.DataFrame(Q.get_all_records())

To prepare the results from Ciw for Plotly's Sankey diagram class, we need to do some additional processing. I also want to include the arrival node and the exit node, just because I can. So we'll need to count how many customers came into the system from the arrival node at each service node:

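first_nodes = (
    recs.sort_values(by="arrival_date")
    .groupby("id_number")["node"]
    .first()  # earliest node visited by each customer
    .value_counts()
    # Name the index explicitly: recent pandas versions no longer call it
    # "index" after value_counts, so renaming that column can silently fail.
    .rename_axis("destination")
    .reset_index(name="flow")
    .assign(node=0)  # 0 is the arrival node
)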

We'll also need to compute the amount of flow between the other nodes. The arrival node is handled separately because it doesn't appear in the records table.

recs = (
    recs.groupby(by=["node", "destination"])
    .size()
    .reset_index(name="flow")
    .replace({-1: 4})  # Ciw marks the exit node with -1; use index 4 instead
)

Now we can combine these two sources. The -1 that Ciw uses for the exit node has already been replaced with 4 above, since Plotly needs a proper index for every node.

recs = pd.concat((first_nodes, recs))
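
One way to sanity-check the combined table before plotting is to compare, for each service node, the flow coming in with the flow going out. As far as I understand, the records used here are only written when a service completes, so customers still queueing or being served when the simulation stops would count as inflow but not as outflow, and inflow should never be smaller than outflow. The sketch below uses made-up (node, destination, flow) triples rather than the real simulation output, and `check_flows` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up (node, destination, flow) triples in the same layout as the
# combined table: 0 is the arrival node and 4 is the exit node.
edges = [
    (0, 1, 60), (0, 2, 40),   # arrivals into cold food and hot food
    (1, 2, 17), (1, 3, 41),   # cold food on to hot food or the till
    (2, 3, 55),               # hot food on to the till
    (3, 4, 94),               # till on to the exit
]

def check_flows(edges, source=0, sink=4):
    """Return {node: (inflow, outflow)} for every node except source and sink."""
    inflow, outflow = defaultdict(int), defaultdict(int)
    for node, destination, flow in edges:
        outflow[node] += flow
        inflow[destination] += flow
    nodes = (set(inflow) | set(outflow)) - {source, sink}
    return {n: (inflow[n], outflow[n]) for n in sorted(nodes)}

balance = check_flows(edges)
for node, (n_in, n_out) in balance.items():
    # A positive difference means customers were still at the node at the end.
    print(f"node {node}: in={n_in} out={n_out} still there={n_in - n_out}")
```

With the DataFrame above, the triples could be produced with `recs[["node", "destination", "flow"]].itertuples(index=False)`.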

Finally, let's sort the values just for fun (as far as I know this doesn't change anything of substance).

recs = recs.sort_values(by=['destination', 'node', 'flow'])

Now we can prepare the Plotly figure as follows:

import plotly.graph_objects as go

fig = go.Figure(
    go.Sankey(
        arrangement='snap',
        node=dict(
            # Position in this list is the node index used by the links below.
            label=['ArrivalNode', 'ColdFood', 'HotFood', 'Till', 'Exit'],
            pad=10,
        ),
        link=dict(
            arrowlen=15,
            source=recs.node,
            target=recs.destination,
            value=recs.flow,
        ),
    )
)

Then we can finally change some background settings and print the HTML, which I have pasted below into this post:

fig.layout.paper_bgcolor = 'rgba(0.5,0.5,0.5,0.5)'
fig.layout.plot_bgcolor = 'rgba(0.5,0.5,0.5,0.5)'

print(fig.to_html(full_html=False, include_plotlyjs='cdn'))

And there we have it. A Sankey diagram from Ciw’s results output.

I'll be the first to admit that this diagram could use some work, but I am pleased with how little code it takes to get a prototype. The main problem is how jumbled the arcs going out of cold food are. It may be possible to set the x and y positions of the nodes from a different layout to obtain a nicer arrangement. The NetworkX package has some layout functions that might supply such arrangements even if the core NetworkX drawing utilities are not used.
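
As a rough illustration of the layout idea, node positions can be computed by hand and passed to Plotly through the `x` and `y` entries of the `node` dict, which take normalized values between 0 and 1. The sketch below does not use NetworkX. It places each node in a column given by its longest distance from the arrival node and spreads the nodes within a column evenly. The helper name `layered_positions` and the hard-coded edge list are assumptions for illustration, and whether this actually untangles the real diagram is untested.

```python
def layered_positions(edges, n_nodes):
    """Return (x, y) lists in [0, 1] for a set of (source, target) edges."""
    # Longest-path depth: relax every edge repeatedly. An acyclic graph
    # settles within n_nodes passes; the cap also stops cyclic input.
    depth = [0] * n_nodes
    for _ in range(n_nodes):
        changed = False
        for source, target in edges:
            if depth[target] < depth[source] + 1:
                depth[target] = depth[source] + 1
                changed = True
        if not changed:
            break
    max_depth = max(depth) or 1
    # Group nodes by column so each column can be spread out vertically.
    columns = {}
    for node, d in enumerate(depth):
        columns.setdefault(d, []).append(node)
    x, y = [0.0] * n_nodes, [0.0] * n_nodes
    for d, members in columns.items():
        for rank, node in enumerate(members, start=1):
            x[node] = d / max_depth
            y[node] = rank / (len(members) + 1)
    return x, y

# Routes in the café example: 0 arrival, 1 cold food, 2 hot food, 3 till, 4 exit.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
x, y = layered_positions(edges, n_nodes=5)
print(x)
print(y)
```

The two lists would then go into the figure as `node=dict(..., x=x, y=y)`. Plotly's `arrangement` setting controls how far nodes may move from these positions, so a value such as `'fixed'` may be needed to keep them in place.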

Now, to me, the point of such a diagram isn't to show a plan for a model, but rather to show what actually happened. It shows us how many of the customers went through each of the routing paths.

The above code needs some minor changes to be reusable, but it is definitely a process that can be generalized to work on other simulation records from Ciw. The details of the model were not particularly important beyond what could be extracted from the results. The structure of the model was implicit in the results. Some parts of the structure could be missing when the simulation is either small or heavily constrained, but then again that might be desirable if you are interested in seeing the actual flow rather than what you think you put into the model.
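
As a sketch of what such a generalization might look like, the function below builds the source, target, and value lists directly from a list of records, without pandas. It assumes each record exposes `id_number`, `node`, `arrival_date`, and `destination` attributes, as the Ciw records used above do, and that leaving the system is marked with a destination of -1. The name `sankey_links`, the stand-in `Rec` tuple, and the toy records are all made up for illustration.

```python
from collections import Counter, namedtuple

def sankey_links(records, exit_marker=-1):
    """Return (source, target, value) lists; 0 is arrival, max node + 1 is exit."""
    # Assumes the highest-numbered node appears at least once in the records.
    exit_index = max(r.node for r in records) + 1
    flows = Counter()
    first_seen = {}  # customer id -> (arrival date, node) of earliest record
    for r in records:
        target = exit_index if r.destination == exit_marker else r.destination
        flows[(r.node, target)] += 1
        known = first_seen.get(r.id_number)
        if known is None or r.arrival_date < known[0]:
            first_seen[r.id_number] = (r.arrival_date, r.node)
    # Each customer's earliest node is where they entered from the arrival node.
    for _, node in first_seen.values():
        flows[(0, node)] += 1
    edges = sorted(flows.items())
    return (
        [s for (s, _), _ in edges],
        [t for (_, t), _ in edges],
        [v for _, v in edges],
    )

# Stand-in for Ciw's record type, carrying only the fields used here.
Rec = namedtuple("Rec", "id_number node arrival_date destination")
toy = [
    Rec(1, 1, 0.5, 3), Rec(1, 3, 1.4, -1),  # cold food, till, exit
    Rec(2, 2, 0.9, 3), Rec(2, 3, 2.0, -1),  # hot food, till, exit
    Rec(3, 1, 1.1, 2),                      # sent to hot food, not finished there
]
source, target, value = sankey_links(toy)
print(list(zip(source, target, value)))
```

With the simulation above, `sankey_links(Q.get_all_records())` would give lists that can be passed as `source`, `target`, and `value` in the `link` dict. The node labels still have to be supplied by hand, since the records only carry node numbers.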

This post is licensed under CC BY 4.0 by the author.