Markov modeling via ordinal partitions: An alternative paradigm for network-based time-series analysis

Mapping time series to complex networks to analyze observables has recently become popular, both at the theoretical and the practitioner’s level. The intent is to use network metrics to characterize the dynamics of the underlying system. Applications cover a wide range of problems, from geoscientific measurements to biomedical data and financial time series. It has been observed that different dynamics can produce networks with distinct topological characteristics under a variety of time-series-to-network transforms that have been proposed in the literature. The direct connection, however, remains unclear. Here, we investigate a network transform based on computing statistics of ordinal permutations in short subsequences of the time series, the so-called ordinal partition network. We propose a Markovian framework that allows the interpretation of the network using ergodic-theoretic ideas and demonstrate, via numerical experiments on an ensemble of time series, that this viewpoint renders this technique especially well-suited to nonlinear chaotic signals. The aim is to test the mapping’s faithfulness as a representation of the dynamics and the extent to which it retains information from the input data. First, we show that generating networks by counting patterns of increasing length is essentially a mechanism for approximating the analog of the Perron-Frobenius operator in a topologically equivalent higher-dimensional space to the original state space. Then, we illustrate a connection between the connectivity patterns of the networks generated by this mapping and indicators of dynamics such as the hierarchy of unstable periodic orbits embedded within a chaotic attractor. The input is a scalar observable and any projection of a multidimensional flow suffices for reconstruction of the essential dynamics. Additionally, we create a detailed guide for parameter tuning. We argue that there is no optimal value of the pattern length m, rather it admits a scaling region akin to traditional embedding practice. In contrast, the embedding lag and overlap between successive patterns can be chosen exactly in an optimal way. Our analysis illustrates the potential of this transform as a complementary toolkit to traditional time-series methods.


I. INTRODUCTION
Data analysis methods that employ concepts and measures from network science [1][2][3][4][5][6][7][8] have been increasingly attracting the interest of nonlinear time-series-analysis practitioners over the past decade. Applications include human electrocardiograms (ECGs) [2,9,10], time series of earthquake epicentre locations [11], marine paleoclimate data [12], surface air temperature measurements [13], gas-liquid phase transitions [14], coastal hurricane activity [15], foreign exchange rates [16], lake sediment records [17], global equity index time series [18], and diode resonator circuit data [19]. The chief aim is the detection of dynamical characteristics given scalar time series by encoding information into an abstract structure and computing various statistics thereof. A complex network representation provides immense potential towards this end due to the variety of existing measures and their computational simplicity. Additionally, it provides a natural framework for defining the notion of connectivity between nonlinearly interacting components and studying characteristics of topological nature. The pertinent question, however, is whether network representations can indeed capture fundamentally different system dynamics and offer complementary insight to existing techniques.
Various studies have demonstrated a proof of concept by using networks to differentiate dynamical behavior and mapping features of time series to structures exhibiting distinct topological properties [20,21]. However, it remains unclear how the resulting networks relate to the underlying dynamics. Here, we investigate a mapping based on symbolic dynamics, in particular ordinal analysis [22][23][24][25], and attempt to provide a solid theoretical foundation to the class of ordinal network transforms [7,9,10,19,[26][27][28][29]. We reformulate the technique first introduced by Small [7] via a rigorous Markovian framework that allows the approximation of deterministic dynamics through examination of statistical properties of a stochastic model [30]. The key concept is the description of system-wide dynamics through the evolution of sets, rather than characterizing dynamical behavior at the level of trajectories. Traditionally this is achieved by using Ulam's method for discretizing the Perron-Frobenius (or transfer) operator [31,32]. Instead, we use a modified version defined on ordinal partitions.
Existing ordinal network formulations [7,9,10,19,[26][27][28][29] are not based on a rigorous framework, but heuristic approaches. The network tools employed so far are, therefore, not inspired by a theory, e.g., ergodic. Instead, simple measures used widely in hot areas of network science, e.g., number of nodes, mean number of connections (or its variance), average shortest path length, etc. have constituted the main tools of analysis. We have specifically opted for a rigorous approach to (a) be able to choose network metrics for the purposes of analysis in a targeted, informed manner, (b) explain characteristics of the network topology produced when encoding deterministic chaotic dynamics and (c) provide a link between ordinal network features and the partitioning of the original state space.
Our experiments illustrate the connection between fundamental dynamical features underlying the input time series and the structure of the network. We uncover a relationship between the hierarchy of unstable periodic orbits (UPOs) embedded within chaotic attractors and the cycles found in the network. An inverse mapping of network properties to the original dataset enables accurate localisation of low-order UPOs. With computational efficiency in mind, one of the reasons for promoting this framework is to set the stage for generating quantitative estimates of characteristic dynamical indicators, e.g., Lyapunov exponents [33] and topological entropy [34], using ergodic-theoretic ideas.
This work is also intended to be a guide to the practitioner, highlighting subtle aspects of parameter tuning and shedding light on sampling constraints. All other ordinal network studies recommend parameter selection either based on heuristic arguments or upon inspection of results, without rigorous reasoning or consistent selection among different systems. One of the reasons is that their focus is application-oriented, hence an ad hoc choice targeting a specified dataset in mind may be more suitable within such a scope. In contrast, the motivation behind this study is to assess the extent to which ordinal network representations remain faithful to the original dynamics. Our suggestions are the result of sensitivity analysis drawn from numerical experiments on a large variety of test systems, both discrete and continuous. It led to different conclusions with respect to appropriate selection; a main finding being that there is no optimal value for the pattern length, rather it admits a scaling region similar to the embedding dimension in traditional practice. Finally, we advocate using Poincaré maps in the case of continuous flows as constructed networks are less sensitive and more effective for detecting UPOs or estimating invariants [33,34]. In summary, our findings comprise evidence that the proposed mapping can be used to augment traditional analysis toolkits.

II. TIME-SERIES-TO-NETWORK TRANSFORMS
Zhang and Small [2] first utilized complex networks to examine properties of various time series. Mapping pseudoperiodic cycles to nodes on a network, with connections assigned by a similarity criterion, they used statistical measures to examine the topology of the network. Their analysis showed a distinction in topological structure between chaotic and random signals. Several different network transforms have been proposed [1,[3][4][5][6][7]14], each best equipped to capture certain aspects of dynamical behavior-and hence more appropriate for applying to certain systems-while less effective in capturing other aspects [20]. The various methods differ primarily on the nature of the entity for which nodes encode, and therefore the type of information represented by the network: (a) topological characteristics of state space [3,6,35], (b) temporal information [1,7], or (c) geometrical properties of the time series [4]. Network transforms may be broadly classified into three categories [20].
In proximity networks nodes correspond to segments in state space. Connectivity is defined by considering mutual proximity of different segments. The most commonly employed method in this class is the recurrence network paradigm [6]. It has proved effective for real-world data [12,17] and, more recently, some theoretical links between classical nonlinear dynamics tools and the network structure have been presented [36]. Inversion of the transform has been proved for a special class of appropriately weightedrecurrence networks under certain conditions on the given the time series [37]-specifically, a uniform distribution of points on the attractor-as well as for equivalent (all other aspects being equal) k-nearest neighbor networks [38]. In visibility graphs nodes represent single time series points and connectivity is defined by considerations of convexity constraints [4,18,39]. Transition networks comprise formations whereby nodes represent coarse-grained dynamical states and connectivity is defined by temporal succession [1,7,9,19,26,40]. While the choice of network transform is critical, and each method can shed light on different characteristics of dynamical behavior, the concept of recurrence is underlying the majority of existing methods [20]. This refers to a system's propensity to revisit certain dynamical states over and over again as time progresses. In mathematical terms, this renders measure-preserving dynamical systems well suited to this type of analysis due to the Poincaré recurrence theorem.

A. Transition networks
The working hypothesis behind almost the entire spectrum of nonlinear time-series-analysis techniques is the existence of a deterministic component in time series, possibly hidden by noise [41]. The framework of a state space may then be established and, under certain technical conditions (endowment with a Borel algebra and a measure-preserving evolution rule), one can define the natural invariant measure, i.e., the probability distribution of all possible states [42]. In practice, most systems are intuitively thought of as evolving continuously in time and capable of obtaining any value from within an interval (continuous state space). This leads to the necessity for discretization, which can be defined by any arbitrary partition of state space [43]. One option is a regular-lattice partition into quantiles or equal-size bins [44,45], i.e., the familiar histogram at a specified resolution, which constitutes the most commonly employed choice.
Dynamical states are then labeled in a discrete form by each bin. Transition networks are formed by estimation of transitional probabilities from one state to another by means of frequency statistics computed from time series [1,40,46]. Therefore, in stark contrast to proximity-based approaches, transition networks explicitly involve the temporal order of observations and, thereby, preserve (to a certain extent) causal relationships contained in the dynamics within connectivity patterns. At the same time, a common drawback of transition networks is that amplitude information, especially finer aspects such as small fluctuations, is discarded. This class of approaches is, consequently, better suited for identifying state space regions that are central to the causal evolution mechanism of the system, e.g., high mixing, stretching and transport of lobe dynamics. Nicolis et al. [1], for instance, uncovered a link between local stability properties, as well as global indicators, of the underlying dynamics and the incoming and outgoing connections in the network. Padberg et al. [40] followed a different direction using set-oriented methods to identify global transport barriers in continuous flows.
The focal point of this study is a methodology which belongs to the class of transition networks. The foundation is the notion of an ordinal partition of state space, dynamics on which can easily be obtained by counting and categorizing short time-series segments. Ordinal partition networks (OPNs) [7] are a model of the coarse-grained dynamics in that connectivity between nodes is defined via temporal succession of symbols that label each category. Sun et al. [26] extended the original transform by including amplitude information in the symbolic representation and allowing directed (forwardtime causality only) and weighted edges (frequency count of occurring transitions between a pair of symbols). McCullough et al. [19] considered a generalized version of this formulation by including a time lag parameter into the network formation algorithm and applied the technique to diode resonator data. Masoller et al. [27] applied this OPN variant with a single-step lag to one-dimensional discrete maps and semiconductor laser intensity data to capture changes in a time series. Similarly, Kulp et al. [10] applied it to ECG data from patients with a variety of heart conditions. McCullough et al. [9] generalized the framework even further by varying the lag to conduct multiscale analysis on ECG data. Zhang et al. [28] augmented the OPN construction algorithm to map multivariate time series to a network. Guo et al. [29] proposed a different scheme for multivariate time series whereby cross and joint OPNs are computed. Input series are drawn from two coupled systems, where synchronization is often present.
All existing OPN variants share three characteristics. First, the methodology itself, as well as the resulting networks, were not explored in a rigorous fashion. No unifying framework relating network topology to the original dynamics has been proposed so far. Hence, no clear way of estimating dynamical invariants exists due to the lack of a mathematical foundation. Second, in all these studies metrics that belong to two groups were employed to analyze a specific dataset at hand. They comprise either (i) common network science tools (e.g., mean and variance of number of connections, characteristic path length etc.; see Refs. [10,19,26]) or (ii) information-theoretic measures (some form of Shannon entropy on empirical distributions of nodal properties; see Refs. [9,[27][28][29]). Finally, an ad hoc selection of parameters is suggested based either on (i) heuristic arguments or (ii) inspection of results, without providing rigorous reasoning or consistent selection among different systems. We propose a framework that (a) allows a natural interpretation of the network as an approximation to the dynamics, (b) provides an informed, targeted means of choosing useful metrics, (c) enables use of ergodic-theoretic tools and (d) suggests parameter selection according to sensitivity analysis of the most general OPN variant for univariate series applied to a gamut of discrete and continuous systems.

B. Ordinal symbols
Symbolic dynamics is an incredibly powerful tool in the study of dynamical systems. The core idea is the discretization of state space into a finite number of components, each labeled by a different symbol drawn from a finite set, an alphabet. Dynamics of infinitely long symbolic sequences, e.g., from an alphabet such as the binary set {0,1}, can-under certain conditions-be topologically conjugate to the original dynamics [47]. Ordinal symbolization considers a linearly ordered state-space partition based on patterns which encode the relative amplitude of observed values in short consecutive time series segments. Each symbol reflects the rank ordering of sample points within each segment. By recording the relative frequency of occurrence of all symbols, an empirical probability distribution is constructed. The Shannon entropy of this distribution, permutation entropy (PE), quantifies the diversity of possible orderings [22]-referred to as the set of admissible ordinal patterns. Given finite data, a practitioner only has access to a sample of this set, the patterns which are in fact realised in the given time series. In the limit of infinitely large pattern length, Bandt et al. [23] proved analytically that PE is equal to the Kolmogorov-Sinai (KS) entropy for piecewise monotonic self-interval maps in R, and a similar statement holds for the topological entropy. Amigó et al. [24] later generalized this result to discrete deterministic systems defined by ergodic maps on intervals in R n . This result relies on a modified concept of PE derived from a finite-state stationary stochastic process equipped with an arbitrary order. A different viewpoint was introduced by the standardised approach due to Keller and Sinn [48], whereby multidimensional ordinal patterns, inspired by the original definition, are defined.
Ordinal analysis rapidly attracted the interest of practitioners, especially among investigations of complex biomedical signals, due its computational efficiency. Contributions in the field include several studies on electroencephalograph (EEG) measurements using PE as a criterion for distinction between different brain physiologies; qualitative changes related to epileptic activity in multidimensional time series obtained from different channels [25], detection of the onset of epileptic seizures [49], differentiation of brain states using ordinal pattern distributions [50], detection of deterministic dynamics in epileptic seizures [51], to name a few. Cardiac interbeat-interval biosignals [52,53], climate time series [54], and financial time series [55] have also been analyzed in the same fashion.
Apart from PE, the other main tool employed in applied ordinal analysis is the count of forbidden patterns (CFP). Deterministic rules dictate certain constraints in the set of ordinal patterns which can be realised. In particular, such systems are characterized by the existence of a set of forbidden patterns [56] that cannot occur in any time series realisations thereof. This set can be thought of as the collection of regions in state space where the trajectory is not allowed to transverse. It is an example of pruning, a rule which forbids the occurrence of certain subsequences in the symbolic dynamics. However, random systems can admit every possible pattern given sufficiently long time series. This stems from the fact that sampling a stochastic process will generate identically and independently distributed (i.i.d.) values. Mapping the sample data set to a symbolic alphabet of finite size results in a sequence of i.i.d. permutations. The probability that any ordinal pattern is missing approaches zero as the total number of data points grows. This pruning rule, i.e., CFP, has been utilized as a criterion for the detection of a deterministic component in noisy time series data [57].
In the current study we investigate the properties of the graphs associated with the Markovian formulation produced by alphabets which are composed of ordinal symbols of increasing length. We argue that counting ordinal patterns from a time series is essentially equivalent to defining a partition in a higher-dimensional state space wherein the underlying dynamics have been embedded. An adapted form of Ulam's method is then used to provide a definition for ordinal networks computed from time series generated by low-dimensional chaotic ergodic systems (see Fig. 1 for a schematic of the transform). We examine the extent to which this network-based stochastic approximation remains faithful to the true deterministic dynamics.
The outline of the paper is as follows. In Sec. III we present the method. In particular, we show how to extract a symbolic sequence from a time series (Sec. III A), define the notion of an ordinal partition (Sec. III B), introduce the Markovian framework (Sec. IV), redefine the construction of an OPN (Sec. IV A), and last, we discuss ergodicity (Sec. IV B) and irreducibility (Sec. IV C). In Sec. V we present the results of our investigations. Specifically, we first explore the representation of simple dynamics as a proof of concept via the transformation of periodic series (Sec. V A). Sec. V B is devoted to the selection of methodological parameters, as well as to practical considerations presented by real-world data, e.g., finiticity. In Sec. V C we examine the network structures obtained by the numerical experiments conducted on various chaotic discrete-time dynamical systems in one and two dimensions. Furthermore, we adapt the application of OPN transforms to data generated by three-dimensional chaotic continuous-time flows by means of Poincaré maps. Finally, in Sec. V D we illustrate the connection between cycles in OPNs and the dominant (lowest-order) UPOs embedded within a chaotic attractor. This result is used to successfully detect UPOs of order 4 in the Lorenz attractor. Conclusions are discussed in Sec. VI.

III. COUNTING ORDINAL PATTERNS
We consider deterministic dynamical systems defined by the pair (M, φ), where the single-valued C r function φ : M → M represents the action of a rule for temporal evolution and the compact manifold M specifies a state space for the system. Points within M, say x n ∈ M ⊆ R d for some dimension d ∈ N, uniquely represent the state of the system at some instant in time. Time is prescribed either in terms of advances in discrete time steps n ∈ Z (iterated maps) or as progression in continuous time t ∈ R, i.e., continuous-time flows generated as solutions to ordinary differential equations. We focus on the former type and examine the latter through the lens of Poincaré maps. Counting patterns via a Poincaré map associated with a flow instead of the numerical approximation obtained by sampling the system at discrete time instants is unorthodox within ordinal symbolization, but 062307-4 offers the benefit of not having to select the sampling rate (and embedding lag, see Sec. V B 1) parameters.
The equation which defines a discrete-time system of d state variables is given the by the recurrence relation of general form where f : R d → R d is a nonlinear vector field which describes the action of the dynamics on states in d-dimensional space, i.e., points defined by the ordered d-tuple x = (x 1 , x 2 , . . . , x d ) and a set of parameters θ. Temporal evolution is described in terms of forward-time iterates from a specified initial state given by functional composition, i.e., the flow which specifies the solution to Eq. (1) may be represented by for some initial state x 0 . A collection of states specified by the solution of Eq. (1) for some initial state x 0 = x(0) is called a trajectory. The set of all state-space points which lie in the 'past' or 'future' of a specified x 0 is referred to as the orbit through x 0 . It is given by the bi-infinite (or infinite for noninvertible maps) sequence where f n (x) denotes the pre-image of the n th iterate of f if n is negative. Ideally, one wishes to possess full knowledge of the temporal evolution of every possible initial state to properly understand a dynamical system. In practice, if the equations of motion are known, then one resorts to numerical simulation and sampling trajectories from several initial states. Experimentally, however, it is often the case that one is in possession of only a single long orbit. In addition, there is often no direct access to the state variables themselves as the time series may comprise values of a measurement function applied to the state of the system.

A. Symbolic itinerary
Let {x * n } N n=1 denote an arbitrary scalar time series of N temporally ordered measurements of the state of a dynamical system. We require that these values are drawn from a linearly ordered set. Our main assumption is that the observable, i.e., the sequence x * n , represents the trajectory of a deterministic system. It may correspond to one of the state variables {x j } j∈{1,...,d} or to a measurement function of the full state of the system g(x). For simplicity we drop the asterisk hereinafter.
The ordinal symbolization technique [22] consists of a coarse-grained segmentation of the dataset into elements (windows) of equal length m, i.e., akin to embedding the time series to m-dimensional space using a time delay τ . Successive windows are not overlapping at w = w(m) ∈ {1, m} points. Thus, the maximum number of windows occurs in the case of maximal overlap at m − 1 points (w = 1) and is equal to N − m + 1. However, the minimum number occurs in the nonoverlapping case (w = m) where the total number of windows equals N m . To illustrate the difference between parameters τ and w, consider the first two segments obtained by setting m = 3, τ = 2 and (A) w = 1 or (B) w = 4. Time lag τ dictates the relation between points within a segment, hence the first two segments are given by (x 1 , x 3 , x 5 ) and (x 2 , x 4 , x 6 ) in scenario A. Slide lag w dictates the relation between successive segments, in particular the extent to which they do not overlap. Hence, the first two segments in scenario B are given by (x 1 , x 3 , x 5 ) and (x 5 , x 7 , x 9 ). The first component of consecutive segments under scheme A is generated by the sequence x 1 , x 2 , x 3 , . . . due to w = 1, whereas the corresponding sequence under scheme B is x 1 , x 5 , x 9 , ... due to w = 4.
Denote the resulting collection of segments (or embedding vectors) by an indexed sequence of m-tuples of the form preserving the temporal order of windows. Each window is mapped to a symbol drawn from a finite alphabet. The set of permutations of the natural numbers {1, 2, ..., m}, over which the symmetric group S m acts, will serve as the alphabet of choice to create a symbolic ordering for each window which reflects the ordinal relations between points within. Ordinal symbols can be thought of either as the set of re-orderings of {1, 2, ..., m} or as the group elements of S m acting on the identity element through the symmetric group's operation, i.e., composition of permutations; the original choice of the identity element as a reference frame is arbitrary and any member of S m comprises a valid alternative. The mapping may be defined in two equivalent ways, namely chronological index ranking introduced by Bandt and Pompe [22] and amplitude ranking introduced by Small [7]. According to the former, shifted time indices within a window are stored in a vector indexed by the corresponding amplitude rank in ascending order. For example, consider one of the windows obtained from segmentation into windows of length m = 4 with τ = 1, w = 1, say the seventh x (4) 7 = (x 7 , x 8 , x 9 , x 10 ), whereby relative magnitude relations are given by By shifting indices {7, 8, 9, 10} to the set {1, 2, 3, 4}, the associated permutation is (4,2,1,3), or in cyclic notation (143)(2). This is due to the fact that the first point of this segment chronologically (x 7 ) is the second largest value in the group and so assumes the third position (due to order being ascending). Similarly, the second chronological value (x 8 ) is the third largest and so placed in the second position, x 9 (third) is the largest and so assumes the fourth position, etc. Thus, according to this mapping, time indices of sample points are sorted.
However, according to amplitude ranking, points within a window are ranked in terms of relative amplitude in descending order, i.e., time indices remain fixed and each sample point is simply assigned a rank. For instance, the 4-tuple (x 7 , x 8 , x 9 , x 10 ) is assigned the permutation (2,3,1,4), or (123) (4), since x 7 is the second largest member of the group, x 8 is the third largest, x 9 is the largest, etc. The sketch in the top panel of Fig. 2 illustrates the difference between the two for the case m = 3. It can be easily shown that both formulations are equivalent by defining a bijective map between the two. In the case m = 2, this rank is equivalent to the traditional setting in symbolic dynamics, i.e., mapping of trajectories to infinite sequences of elements in the binary set {0, 1}.
Formally, an ordinal pattern of length m for the window labeled by x (m) i π x (m) i = (π 1 , π 2 , ..., π j , ..., π m ), π j ∈ {1, ..., m} may be defined as the permutation π ∈ S m , which arranges the points within according to their order, i.e., x π 1 < x π 2 < · · · < x π m (chronological index rank), or as the transformation π : R m → S m such that x i+ j−1 is the π j th largest point in x (m) i (amplitude rank). In the event that two elements of x (m) i are equal, we arbitrarily pick the one that occurs first chronologically as the smallest [22].
By mapping every window to an ordinal pattern, one obtains a symbol sequence corresponding to a given time series uniquely. We represent the ordinal symbolic itinerary of time series {x n } N n=1 by The empirical probability distribution of ordinal patterns is defined via their relative frequency of occurrence within s n , Prob[π; s n ] = P [π] = |{n|s n = π}| where | · | denotes cardinality of a set and "|" means "such that." The number of admissible (distinct occurring) patterns is Since the cardinality of S m is m!, one can evaluate the relative count of admissible patterns computed from a time series over the set of all possible, P (m) a (x n ), and the corresponding proportion in the complementary set of forbidden ordinal patterns (patterns which cannot be realised in a time series due to deterministic constraints [56]). The relative count of forbidden patterns equals In practice, P (m) a , P (m) f are both estimators of the true numbers in the case of N < ∞, i.e., when we are in possession of a trajectory but not the full orbit. Note that the absence of an m-pattern pervades all longer patterns for larger m in the form of outgrowth forbidden patterns whose growth is superexponential in contrast to admissible m-patterns (exponential growth with m). The semilogarithmic plot depicted in Fig. 3 illustrates the rate of decrease of the proportion of admissible patterns computed from chaotic time series (N = 10 6 ) of the logistic (black circles) and cubic (red squares) maps. Note that the rate is faster than exponential. This has been theoretically described for discrete maps [56]. In addition, the discrepancy in the rate corresponding to each map is due to the inherently different dynamics. Since both maps are strongly mixing, the larger state space [−1, 1] is expected to produce a greater diversity of ordinal patterns for finite m. For m 23, the two curves coincide due to undersampling effects.

B. Ordinal partition
In the previous section, we outlined the details of the generalized BP methodology for ordinal symbolization of a given time series. Here we describe the fundamental construction for the modeling process we propose in this study. To speak of Markovian descriptions of deterministic dynamics, we shall restrict our attention to even more specific families of systems.

062307-6
The characteristic shared by all systems under study is chaotic behavior, i.e., (I) sensitivity to initial conditions, (II) topological mixing, and (III) the existence of dense periodic orbits (Ch. 1, Sec. 1.8, Def. 8.5 [58]). In addition, the evolution rule φ is required to be a measurable transformation. Endow M with a σ -algebra M and associate with an invariant probability measure The probability space denoted by the triplet (M, M , ν) is also equipped with a filtration {F n 0 }, i.e., an increasing sequence of σ -algebras on the measurable space M such that n 1 n 2 ⇒ F n 1 ⊆ F n 2 . Since ν is φ-invariant, we speak of a measure-preserving dynamical system, i.e., one which obeys a 'conservation of mass' principle with respect to the action of the evolution operator φ. Infinitely many invariant measures are typically admitted by deterministic chaotic dynamics. Experimentalists often wish that ν corresponds to the natural invariant measure, i.e., the distribution of state space mass over a typical orbit of the system. Therefore, our focus is restricted to examination of ergodic systems [59]. The measure ν satisfies for any continuous function f : M → R and Lebesguealmost all initial conditions x ∈ M.
Given that the natural measure of the system is unknown, one resorts to simpler alternatives to compute an estimate thereof from x n . This inadvertently leads to the consideration of different types of finite state-space partitions (Sec. 6.1 [30]). In practice, the best alternative for the estimator of ν, which we denote byν, reduces to essentially counting symbols using one of several alternatives for a state-space partition [30,32,44]. This finite set of symbols represents a partition of state space into a finite number of connected regions of nonempty interior. Usually the elements of the partition take the form of regular-grid cells [32] whose collection covers M. Dynamics occurring within each element are completely ignored, while the model captures coarse-grained dynamics between elements. It is this classical framework that we adopt, but by adapting it to our setting. In place ofν we choose the normalized Lebesgue measure.
Since ordinal symbols constitute our choice, the first step of our proposed procedure amounts to a mapping of M to a higher-dimensional space R m with m > d via a timedelay embedding of a single point lag [60]. Provided certain smoothness and genericity conditions on φ which are fulfilled by the family of systems defined above, a sufficiently large m-specifically larger than twice the fractal (box-counting) dimension D 0 of manifold M [61]-ensures the existence of a diffeomorphism between the original and reconstructed state spaces. The image state space M IMG , and not the original manifold M, is the space which will be partitioned into a finite number of elements. Therefore, topological equivalence between original and embedded dynamics is ensured via counting patterns of length larger than twice the fractal dimension of the underlying dynamics.
which must always be true for some permutation label π = (π 1 , π 2 , . . . , π m ) ∈ S m as we are only concerned with linearly ordered sets. Moreover, this mapping is obviously unique. The lack of strict inequalities connotes the resolution of ranking ties via chronological order. Intuitively, each element of an ordinal partition is spatially defined by a region whose boundaries are formed by the x i = x j cross-sectional hyperplanes of Euclidean space R m (x i denotes the i th coordinate). For instance, in R 2 the sole boundary is provided by the identity line x = y as Fig. 4 demonstrates for the chaotic logistic trajectory. In R 3 these are the x = y, x = z, and y = z planes as depicted in the partition of Fig. 5. Notice that the identity line x = y = z is a shared boundary between all 3! = 6 regions (different color in Fig. 5). In R 4 the six three-dimensional spaces x i = x j where i, j = 1, . . . , 4 segment the full space into 4! = 24 regions with the identity line (1, 1, 1, 1) T again being a common boundary, a fact which holds in general regardless of the value of m. Therefore, each of the m! regions in R m may also be prescribed by a hyper-arc denoting a different angular location of points with respect to the identity line 1 = (1, 1, . . . , 1) T ∈ R m . The pattern length dictates both the resolution of the partition (uniform size 1/m!) and its dimensionality. Upon increasing the resolution, data is mapped to increasingly higher-dimensional spaces irrespective of the original dimensionality. This is not a partition in the strict sense as we are not faced with pair-wise disjoint elements due to shared boundaries. To overcome this difficulty in practice, Bandt and Pompe [22] suggested using chronological order as a criterion for sorting ranking ties. This is equivalent to sending every observed point on the boundary into the interior of one of the two adjacent elements. However, in rigorous terms, the covering has the following two properties: where Int denotes the interior of a set, excluding boundary points. The set of elements of the partition Q, labeled by permutations of natural numbers, will serve as the symbolic alphabet of an associated Markov model for the evolution rule φ once our symbolic dynamics are defined as follows.
Given a finite time series such as x n , we possess a temporally ordered trajectory that is a subset of the true dynamical orbit O(x 0 ). Embedding with a sufficiently high m and τ = 1 allows us to obtain a subset of a topologically equivalent orbit M IMG , designated by the sequence whereby φ * topologically conjugate to φ. Since M IMG is fully covered by partition Q, which is guaranteed by property (I), every point x (m) ∈ Q π for some permutation π. Similarly for all its iterates φ n * (x (m) ). Consider a sequence comprising of labels of Q n ∈ Q which is defined via the orbit O(x (m) 0 ), i.e., a bi-infinite sequence which represents the analogous symbolic orbit. The finite symbolic itinerary (abuse of notation since we only refer to the finite itinerary s n hereinafter) s n defined in Eq. (4) constitutes a time-contiguous sample drawn from this sequence. Using a decimal point, we can separate s i<0 from s i 0 in the form . . . s −2 s −1 .s 0 s 1 s 2 s 3 s 4 . . . and speak of current state, future and past. This construct suffices for delineation of shifts (and subshifts) of finite type and the associated symbolic dynamics.

IV. MARKOVIAN FRAMEWORK
Instead of following the direction of topological conjugacy and studying subshifts of finite type, as in Ref. [62] within the Amigó paradigm, we opt for a Markovian description. We define discrete-time stochastic processes of countable state space for increasing m on the set of permutations by means of the counting measure and information extracted from a sample trajectory. The procedure is defined via the counting measure applied to the symbolic itinerary s n and the associated σ -algebra Q of all subsets of the symbol space. In particular, we record pairs of consecutive ordinal patterns drawn from the ordered product space S m × S m . They represent forward-time transitions between admissible patterns, or alternatively a single action of φ * on elements of the partition Q. Given a consecutive pair of ordinal patterns in the symbolic itinerary, say . . . , s n = π, s n+1 = ξ, . . . denoting the corresponding partition elements Q π , Q ξ , we can compute μ(Q π ∩ φ −1 * Q ξ )-the number of times the itinerary is in state Q π and the φ * -preimage of Q ξ simultaneously. This quantity represents the proportion of mass moving from region Q π to region Q ξ in state space under a single action of the dynamics. Division by μ(Q π ), the normalized version of which is the probability measure on the space of ordinal patterns as defined in Eq. (5), leads to the required Markovian transition probabilities. Given probability space (Q, Q , μ) along with filtration {F n }, we define the S m -valued stochastic process S = (S n , n ∈ N ) which possesses the Markov property, i.e., ∀π ∈ S m and n 1 , n 2 with n 1 < n 2 , if the input time series comprises a measurement function of a signal generated by a deterministic source. The most fundamental question is that of convergence as m → ∞ since the number of admissible patterns grows exponentially with m [56]. However, the Shannon-McMillan-Breiman theorem asserts that these entropies do not diverge as m becomes large provided the source is a discrete-time ergodic stationary process [63]. That is, the ordinal Markov chain defined in Eq. (12) is characterized by the asymptotic equipartition property.
Note that, strictly speaking, S obeys the Markov property given by Eq. (12) only if w m; in the case of inequality, some information loss occurs since there are time series points that are not encoded. This constitutes a nonoverlapping scheme. If w < m, then successive patterns overlap at m − w points, with w = 1 producing the maximally overlapping scheme. Hence, successor patterns are conditioned on their predecessor, i.e., each new pattern contains part of the previous pattern and becomes part of the following pattern. Therefore, the constraint imposed on the set of admissible ordinal transitions leads to Markov chain S having a finite memory. This setup can be treated as a Markov chain of order m − w, with w m − 1 finite. The future state depends on the past m − w states.

A. Ordinal networks
Ordinal symbolization so as to obtain itinerary s n together with Markov chain S allow the association of a given time 062307-8 series with a directed graph, as is common in the study of Markov chains of countable state (Ch. 1 [64], Sec. 1.4, 1.7 [65]). This is what we term an ordinal network. It is constructed as follows. The set of observed admissible patterns is mapped onto the set of nodes V = {1, 2, 3, . . . ,V } of a graph, say G(V, E ). The total number of nodes (assumed to be nonisolated by construction) equals the network size |V| = V (m), is equal to N (s n , m). Directed edges are assigned to pairs of patterns occurring in temporal succession, i.e., the set of links (or edges) E corresponds one-to-one to admissible forward-time transitions between regions of the ordinal partition Q as observed in the given time series.
Formally, bijective mappings κ : S m → V and ψ : S m × S m → E are applied to the set of admissible symbols (and the product set of ordered pairs of symbols) of itinerary s n to procure an OPN topology. Nonelementary events in the context of the previously defined ordinal stochastic process of order m, say of the form where π, ξ ∈ S m are two ordinal symbols which appear successively within the symbolic itinerary. They, respectively, form the labels of partition elements Q π , Q ξ and nodes indexed by i, j ∈ V. Thus, we can revert to the familiar i, j notation following this, which facilitates comparison of our formulas to the existing literature on ergodic theory.
The elements of the adjacency matrix form a directed multigraph and are simply given by the frequencies where μ denotes the Lebesgue measure given infinitely long time series. In practice, it corresponds to the counting measure. Directed multigraphs comprise a natural representation for a random walk realised according to the rules specified by finite-state time-homogeneous Markov chains such as S.
The associated transition graph dictates that edges are normalized by the number of nearest (outgoing) neighbors of each node; i.e., its elements are identical to the transition probabilities of chain S. Then a right-stochastic matrix, T = (T i j ), is produced with rows summing up to unity. This corresponds to the weighted state-transition matrix of the Markov chain. It is defined by the conditional probability We interpret entry T i j as the probability that a typical point in region Q i of embedding space moves into Q j under one iteration of the evolution operator φ * . The binary formulation A = (A i j ) given by where 1 (0,∞) (x) denotes the indicator function on an open semi-infinite interval, is the corresponding connectivity matrix. It constitutes an unweighted directed graph with A i j = 1 if a forward-time transition between the corresponding ordinal symbols is observed in s n , otherwise it is equal to 0. We shall show that matrix A captures important topological information about the underlying dynamics behind the input time series.
To summarize, constructing OPNs from a specified time series comprises a multistage process. First, a nonlinear transformation to a different set of coordinates in a higherdimensional space is applied. Second, this new state space is partitioned into a finite collection of disjoint sets by means of location of points relative to the identity line x = y = z = . . . . Third, a subsequent mapping to a symbol space is performed. Fourth, this allows the formulation of a discrete-time finite-valued Markov chain. Directed and weighted networks correspond to the natural transition graph representation thereof (see Fig. 6 for example networks representing periodic and chaotic dynamics). All relevant information is extracted via application of the counting measure to the symbolic itinerary s n .

B. Ergodicity
Ergodic theory is concerned with the statistical properties of deterministic dynamical systems which admit an invariant measure. In abstraction, the field concerns qualitative properties of the action of a group on a measure space. This constitutes a departure from the Lagrangian perspective of tracking the dynamics of an individual point and an adaptation of a viewpoint of the motion of collections of particles (or densities). Intuitively, one considers the evolution of a "mass 062307-9 of particles" in state space; the aim being the study of the temporal change in the 'size' of a specified set of particles. The appropriate mathematical framework for general M ⊆ R d is via the Borel σ -algebra (smallest subset of the set of all measurable subsets of R d ), whereby the notion of "size" may be quantified in a meaningful sense by means of the Lebesgue measure.
Dynamical systems may be classified depending on whether (i) mass or (ii) volume is preserved in state space under the dynamics. In the former case, the ergodicity property asserts that, under certain conditions, the time average of a function along a single trajectory exists almost everywhere and is related to the spatial average. This property, which formalises the notion of typicality, is guaranteed by strong analytical results such as the ergodic theorems by Birkhoff [59] and Khinchin [66]. In the latter class, the distinction is between conservative (area/volume/Lebesgue measure-preserving) dynamical systems, dissipative (volume-contracting) systems and expanding maps (volume-generating). Note that the latter two are still measure preserving, but not of the Lebesgue measure. All three such types of systems are amenable to a Markovian approach.
A major advantage of following the Markov modeling paradigm is the ability to overcome all the drawbacks associated with simulating (or examining) a long orbit of a dynamical system, which comprises the traditional approach [30]. Namely, that orbits may display nonequilibrium behavior for a long time before settling into an asymptotic mode and computer round-off errors or limited experimental precision. Rather than compounding a sequence of iterations, one can treat a time series generated by an ergodic system as a collection of single iterates of the governing dynamical law via a stochastic approximation provided by the Markov model.
The framework we propose for interpreting OPNs is essentially an adaptation of the work by Froyland [31] and Dellnitz and Junge [32] on the discretization of the Perron-Frobenius operator via Ulam's method, but in embedding space rather than the original state space. In one-dimensional systems, Li [67] and Boyarsky [68] first demonstrated how powerful this approach can prove. Froyland [69] extended the former's analytical results to random interval maps. The main idea lies on the fact that the statistical properties of interest can be rendered stable by the addition of a small amount of noise [30]. Thus, instead of analyzing the deterministic dynamics directly, one may wish to slightly perturb and create a Markov model. The details of the method for approximating a natural measure in this fashion have been outlined in Ref. [31].

C. Strongly connected graph and irreducibility
The pertinent questions for any Markov chain are the existence and uniqueness of a stationary distribution, which reflects the long-term average occupation of each state if the underlying process is ergodic. Existence is generally guaranteed by finiticity and time-homogeneity (Sec. 1.5, Ch. 4, Convergence Theorem 4.9 [65]) of the chain in question, while uniqueness is slightly trickier. The property of irreducibility is a facet of OPNs which can be instantly checked after symbolization. Topological transitivity is expected to lead to irreducible Markov chains, however finiticity can alter the produced network in the case of undersampling. In the case of lack of irreducibility due to finiticity, we propose a quick and benign way of modifying the network structure which ensures this property. First, we proceed with the necessary definitions.
State (node) j is said to be accessible from state i if P (S n i j = j|S 0 = i) > 0 for some n i j ∈ N ∪ {0}, i.e., there is a nonzero probability that a system realisation (or alternatively, a random walk on the transition graph T ) starting from node i will reach node j at some point in the future (dependent on both i and j, hence the index). Transition i j is represented by a self-loop in T if n i j = 0. A Markov chain is irreducible if all states are accessible from all others. Formally, Furthermore, a state i has period k 1 if recurrent trajectories must occur at integer multiples of k, i.e., k = GCD{n ∈ N|P (S n = i|S 0 = i) > 0}. The acronym GCD stands for greatest common divisor. The chain is aperiodic if k = 1 ∀i ∈ V. These two notions are essential for the facts we establish as follows.
First of all, if a finite-state Markov chain is irreducible and aperiodic, there exists a unique stationary distribution p = (p 1 , p 2 , . . . , p V ) which satisfies the axioms of probability (nonnegativity, i p i = 1 and countable additivity) and such that pT = p. (18) This is the positive invariant measure of the Markov chain. This statement and convergence to equilibrium are wellknown facts, see, e.g., Thm. 1.7.2, Thm. 1.7.7, Thm. 1.8.3 in Norris [64]. Eq. (18) is identical to the left-eigenvector equation of T with eigenvalue equal to unity. The appropriate multiple in this eigendirection corresponding to p is obtained by normalizing the leading left eigenvector. (Since the Perron-Frobenius theorem dictates that the unit eigenvalue is the largest for a stochastic matrix, it is simple, and so the corresponding eigenspace is one-dimensional and all elements of the leading eigenvector are positive.) Second, we are also concerned with the property of ergodicity. The celebrated ergodic theorem due to Birkhoff [59] was initially proved within the framework of dynamical systems on smooth manifolds, but was later recast in terms of a general information source by Khinchin [66]. It is this form we are interested in since we would like to ensure that the ordinal Markov chain S has a representative limiting behavior in terms of a time-average from a long trajectory. Fortunately, irreducibility suffices to prove this fact (Thm. 1.10.2 [64]).
Third, irreducibility is manifested in the form of a strongly connected transition graph, i.e., a path exists from every node to any other. We remark that an OPN will be irreducible by virtue of construction since a node is connected to the nodes corresponding to the predecessor and successor symbols within the itinerary s n , except in the following two problematic cases: (a) the node corresponding to s 1 has no incoming links, or (b) the node corresponding to s N−m w +1 has no outgoing links. This event becomes more likely with 062307-10 increasing m, however it is quite rare judging from numerous numerical simulations and never occurs unless the symbolic itinerary or the OPN is severely undersampled for specified m. In addition, we propose the simple fix of removing the first or last value of a time series, repeating the irreducibility check (via strong component detection algorithms) and iterating this procedure if necessary.
Finally, we remark that it can be easily shown that the relative (over all nodes) out-strength distribution of the directed multigraph W and the stationary distribution of the ordinal Markov chain S are identical. Denote by k out W (i) the out-strength of node i ∈ V, i.e., the total number of adjacent outgoing links This term is equivalent to the weighted out-degree of a node. By "out-degree" we refer to the (binary) connectivity graph represented by A. It corresponds to the number of distinct outgoing neighbors, i.e., the total number of successor patterns. Note that by nature of construction of an ordinal multigraph, every node has the same number of incoming and outgoing links with the exception of possibly two nodes. This is a direct consequence of the fact that all the symbols in itinerary s n , except the first and last, are connected to their predecessor by an incoming transition and their successor by an outgoing transition. Otherwise we have that k in Provided a sufficiently long time series, equality follows in the limit k out W (i) 1. Observe that the out-strength distribution of the network obeys Eq. (18), the left-eigenvector equation which element-wise takes the form Consider the left-hand side of Eq. (20) Evidently, the out-strength (and in-strength) of multigraph W is a left-eigenvector and can be made identical to the stationary distribution p via normalization. Figure 7 shows the Ikeda attractor (μ = 0.9) with state-space points of the chaotic trajectory colored according to the relative out-strength of the corresponding node in the OPN computed for m = 10.
Since the nodal out-strength corresponds to a symbol s i in the itinerary, we mapped colors to the first value x i in the window x (m) i .

A. Periodicity in ordinal symbolic itineraries
The first test that the OPN construction is subjected to is related to its capacity for accurate representation of periodicity. We examine network topology separately for the case of (a) discrete-time and (b) continuous-time systems, as sampling frequency plays a central role in the latter case.

Discrete-time dynamics
Numerical results show that periodicity of a trajectory is captured by the ordinal symbolic itinerary s n , although not necessarily at the original period. We present evidence that the corresponding OPN also preserves dynamical topology. Random walks from an arbitrarily chosen initial node recover the original symbolic trajectory with full certainty, provided a necessary and sufficient condition on m. If unfulfilled, then symbol sequences of lower period arise. We also make a preliminary case for selecting maximal overlap (slide lag w = 1) as it exhibits minimal sensitivity to the value of m, in contrast to all other overlapping variants.
In the trivial case of an equilibrium state, a time series comprises of a single repeated value. This results in a singlesymbol itinerary regardless of chosen parameter values. The corresponding OPN constitutes of one node and a self-loop linking the node to itself. To consider regimes of longer period, we initially examine maximally overlapping symbolization.
Trajectories of period 2 k for k ∈ N lead to directed cyclic graphs of size 2 l where l ∈ N such that 0 l k. The case l = 0 corresponds to the uninteresting m = 1. If m 2, hence l = 0, then a random walk-irrespective of starting node-produces a 2 l -periodic symbolic itinerary since only one possible path exists within the network. This path is the network space analog of the original dynamical orbit. This is also the case for 3-and 5-periodic regimes, which leads to the hypothesis that orbits of any integer period may be captured by OPNs.
For instance, Fig. 8  itinerary with probability equal to P = 1. This finding is confirmed for time series generated by analogous parameter regimes (found within the period-doubling cascade regions in each) of the 1D cubic and the 2D Henon and Ikeda maps. Figure 9(a) portrays the network size V (m) for various values of m using maximal overlap. It is evident that the condition m > 2 k−1 is required to ensure the recovery of the original periodicity. Otherwise, the network's size will be equal to a submultiple of the original period and a random walk will produce symbol sequences of lower periodicity. This is due to the fact that an increasingly finer resolution of state space is necessary to encode the correct period. This signifies the requirement for an infinitely fine partition to fully encode the original dynamics if the accumulation point of transition to chaos (2 ∞ periodicity) is under examination. Such an unrealistic condition in practice motivates further the case for stochastic descriptions and Markovian modeling of low-dimensional dynamics. Additionally, there are two other comforting aspects. First, even if the detected periodicity is lower than the underlying, ordinal symbolization never fails to incorporate the periodic nature of the dynamics into the network topology. Second, for maximally overlapping symbols, OPNs exhibit no sensitivity to the value of m as long as it is selected above the threshold value of half the original period, i.e., m = 2 k−1 or larger. The existence of a limiting topology in the periodic case using maximal overlap is confirmed by looking at the corresponding link densities in Fig. 9(c). Dynamics are fully captured in this case.
The situation is radically different for the other extreme of the range of w, nonoverlapping OPN variants. Counting patterns with zero overlap leads to network topologies of varying size, as shown in Fig. 9(b), and link density, as portrayed in Fig. 9(d), which depend on an algebraic relationship between the underlying period and the length of ordinal patterns. Convergence to a limiting structure as m increases can, thus, never be established. There is no threshold value of m beyond which OPNs attain a constant structure. Both size and link density are fluctuating in an oscillatory fashion with increasing m, a fact that is true for all values of the slide lag w = 1. In fact, the network collapses to a single node if m is a whole multiple of the underlying period, with link density attaining the maximum allowed value of 1 (set to unity by convention since the network comprises a single node linked to itself).

Continuous-time dynamics
Encapsulation of periodic dynamics within an ordinal representation manifests in a different manner in the case of continuous-time flows. We discuss an example drawn from the Rössler equations. In particular, OPNs computed from trajectories of a periodic and a 4-periodic (Fig. 10) regime for various m do not lead to the directed k-cyclic structures observed for periodic regimes of discrete-time flows. Instead, resultant networks exhibit a 'ring'-resembling topology when m is large.
The reason for this type of imperfect-but simultaneously strongly recurrent, almost regular-topology is twofold. First, the continuous-time flows underlying the networks were sampled at approximately, and not exactly, 20 points per cycle (T periodic mean = 6.1746, T 4−periodic mean = 6.2078 and t = 1 20 T mean ). Therefore, integer m is required to be either a multiple of the mean cycle period T mean or an exact factor (since T mean ∈ R, by "exact factor" we mean a divisor such that the quotient is a positive integer). of T mean . Second, comparatively larger values for m are necessary to reveal the original periodicity than in the case of discrete dynamics. For instance, were we able to sample the periodic regime at exactly 20 points per cycle, a value of m = 20 would be necessary to ensure that the corresponding network is (the correct) trivial structure of a single node and a single self-loop. One simple way to overcome this issue is by studying an appropriate Poincaré map, often referred to as Poincaré Surface of Section (PSS), of the continuous flow. We computed the x max return map for the Rössler regimes using the T ISEAN package [70] (here the PSS corresponds to the x axis in tangent space [41]) to verify this. We observed that depending on precision of its values, an appropriate thresholding, i.e., a cut-off of decimal points exceeding the number of significant digits, leads to the recovery of the directed k-cyclic graphs observed for discretetime flows.

B. Sensitivity analysis
Three methodological free parameters are involved in the construction of an OPN. The length of ordinal patterns m dictates the size of the symbolic alphabet (equal to m!), the level of resolution of the ordinal partition-which admits an inverse relationship with the alphabet's size (equal to 1/m!)and the dimension of embedding space. The time delay τ dictates the lag involved in the embedding transformation. Slide lag w is equal to the number of nonoverlapping points between successive patterns within the itinerary s n . The main examples chosen for the purpose of our demonstrations are the logistic map [71] and the Lorenz flow [72] within wellstudied parameter regimes (see Appendices B and C for the full collection of tested systems in discrete and continuous time).
The existing OPN studies suggest different procedures for selecting parameters. Small [7] propose using the graphical inflexion of the V (m) curve or the maximum of the network link entropy (average node-link entropy) curve for identifying the optimal value of m. The values τ = 1 and w = m are used for the other two parameters without providing strong reasoning. Sun et al. [26] performed their analysis with an arbitrary choice of m = 8 by arguing that networks computed with increasing m should in principle belong to the same class. The authors use τ = 1 and w = 1 without providing a rationale behind this choice. McCullough et al. [19] suggest the interval m ∈ [6, 10] as the most useful choice based on their empirical findings from testing model systems. The value for lag τ is selected based on the first zero of the autocorrelation function, whereas the slide lag w = 1 is picked without any motivation. Masoller et al. [27] select values based solely on the premise that large m will lead to undersampling. Hence, the arbitrary choice m = 4 is favored due to the small alphabet that it produces. The focus of this study is to examine short experimental time series, therefore no further consideration is given to determining parameter values and the choice (τ, w) = (1, 1) is not discussed. Similarly for Kulp et al. [10] who only propose a mechanism for selecting m by increasing it until a nonzero CFP is observed. McCullough et al. [9] vary τ , use w = 1 without discussing and pick m a posteriori by computing some metric on the network and observing its variation with increasing m. Zhang et al. [28] opt for m = 2 as its interpretation is simpler than higher values within the multivariate framework, τ = 1 and w = 1 without supporting arguments. Guo et al. [29] do not discuss parameter selection.
In contrast, we make recommendations based on the sensitivity analysis that we conducted. Just as network visualization suggests (e.g., see structures computed with different m for the 4-periodic Rössler regime portrayed in Fig. 10), the value of m is of critical significance. A spectrum of values lying within a scaling region is appropriate. We, thus, recommend the ensuing analysis (Sec. V B 2) as a first step of application to a newly presented dataset. Furthermore, we posit that the set (τ, w) = (1, 1) is optimal based on arguments of (a) observing false admissible patterns and (b) exploiting short-term correlations to impose constraints on the admissible transitions. Additionally, we investigate the effects of undersampling and discuss facets related to this constraint.

Time delay
In this section we motivate the particular choice of the value τ = 1 for the time delay, irrespective of the dataset at hand. In the case of discrete-time flows, the value of unity comprises a natural choice since sample points are essentially obtained at arbitrary precision (or, in practice, as far as numerical precision will allow). The first (and global) minimum of the mutual information between the time series and its history occurs at the value of 1 for discrete chaotic maps.
In the case of continuous-time flows, the decision is more complicated. If a longer time delay is required, then several heuristics for appropriate choice exist in the literature, most notably the first zero of the autocorrelation series (Ch. 1, Sec. 1.3 [73]) and the first minimum of the mutual information (Ch. 1, Sec. 1.4 [73]). System-dependent choices have also been proposed in the ordinal network literature [19]. Sometimes it is preferable to vary the time delay-especially when faced with multiscale dynamics-to identify the various time scales, as in the complexity-entropy causality plane analysis by Zunino et al. [74] or the ordinal analysis implemented by McCullough et al. [9] and applied to human cardiac dynamics.
We propose to opt for the τ = 1 value via examination of an associated PSS. This renders the choice as natural as in the case of discrete-time flows. Our choice enables avoiding undesired aliasing effects [9], which is important to prevent the presence of false admissible patterns which do not reflect the actual deterministic dynamics. The absence of aliasing in itinerary s n ensures the more accurate approximation of transition probabilities and connectivity in the resulting network.

Pattern length-Embedding dimension
Various ways for finding an optimal value for m have been suggested in the literature, most notably the graphical inflexion and maximum network link entropy propositions in the original paper [7]. McCullough et al. [19] suggested an alternative which considers the first two moments of the degree (vertex connectivity) distribution in combination to visual inspection. However, numerical evidence in this assay suggests that useful information may be obtained by considering various m values within an "optimal range." Our results indicate the existence of a scaling region. It is the equivalent of a combination of the scalings observed for various embedding dimensions and spatial scales in traditional embedding practice towards invariants' estimation. Therefore, we propose that instead of looking for an optimal value, it is more informative to focus on this region in parameter space and consider the interplay between mutually conflicting practical considerations, such as undersampling, on the one hand, and increasingly finer partitioning, on the other hand. All values of m within this region are satisfactory and then computational considerations may be the determining factor, i.e., picking the smallest possible value. Figure 11 shows a plot of the total number of nodes for 2 m 35 on a semilogarithmic scale. The corresponding OPNs were computed from a time series of N = 10 7 points generated by the logistic rule x n+1 = 4x n (1 − x n ). The inset figure on the bottom right portrays the V (m) curve on linear axes. We observe a sigmoidal curve, whereupon it seems that The semilogarithmic plot shows that V is actually increasing exponentially fast for all m until the presaturation region 20 m < 25. Entering this region of parameter space signifies the transition from exponential to linear growth. Increasing m by a single unit produces one to two million additional nodes. We hypothesize that an OPN can provide a meaningful simplification to the underlying dynamics until some stage past the inflection point in the V (m) curve. In the case of the chaotic Lorenz flow (sampled at discrete times, not the PSS), the inflection point usually occurs when m is slightly larger than an average period of the system. Although the rate of information incorporation into the network starts decreasing at this stage, some information about dynamical evolution may still be encoded within. Once m > 25, the network size exhibits gradual saturation to a constant value equal to N − m + 1 N, the total number of segments extracted from the time series. This transition signifies the collapse of the network representation to a single string. In this formulation, all nodes have strictly two connections to those representing the preceding and succeeding dynamical state, with the exception of the initial and final state which have one only. It is a completely trivial type of network without much use. Thus, once m represents a sufficiently large history of a trajectory in state space, the network is no longer a useful representation of the dynamics. The culprit is data length insufficiency. There is a limited amount of transitional information in the sense that far more connections between dynamical states occur due to the increasingly finer partitions but at a much lower frequency (recall that m affects the resolution limit (m!) −1 in this type of partition).
To explain these observations, consider the pruning rule of forbidden patterns associated with deterministic systems [56]. Given infinitely long data and as m increases, the number of admissible patterns grows exponentially while the number of forbidden patterns grows at a super-exponential rate. Therefore, although the V (m) curve is monotonically increasing, the relative count of admissible patterns P (m) a is actually decreasing rapidly. Finiticity of data renders this effect more pronounced, which implies we are possibly sampling only a portion of the real network. Undersampling occurs if m is larger than a certain threshold given fixed data length N. The heuristic condition N m! + m − 1 is required to ensure an adequately large data set in comparison to a specified partition resolution (m!) −1 . Hence, up to m 10 we have may sampled the network in full (10! = 3, 628, 800 and N = 10 7 ), while only partially for larger m.
Furthermore, admissible patterns grow exponentially as m increases. Bandt et al. [23] proved that for piecewise monotone one-dimensional maps the rate approaches the topological entropy of the evolution operator h TOP (φ). In the case of continuous-time flows, this relates to the entropy of the discrete map sampled at regular intervals acting as the numerical approximation of the continuous system. Therefore, the , with a value approximately equal to 0.79085 for the logistic system. This can be contrasted to the actual value equal to log (2) 0.6931. Disagreement between the linear fit and the true value also holds for Lorenz. In Ref. [34] we show that a network-based measure is better equipped to accurately capture this evasive dynamical invariant.
In summary, the domain of the parameter m contains three nontrivial qualitatively different regions in terms of the generated network topology-and a trivial one (saturation). First of all, if the pattern length m is not sufficiently large, then the size of the symbolic alphabet is too small to uncover the true complexity of the dynamics. This is akin to a histogram of very coarse resolution in traditional partitioning of time series data. The simplification is inadequate. Second, when m is larger than a critical value-presumably dependent on the dimensionality of the system and in particular the active degrees of freedom as established by Sauer et al. [61] in traditional embedding theory-the ordinal partition is finer and can capture the dynamical evolution of the system in a more elaborate manner. Very fine transitions from a dynamical state to another, which could not be captured by choosing smaller m, can now be incorporated into the network representation. In a practical setting, given finite data of possibly rather limited length (e.g., notoriously, geoscientific measurements), a heuristic rule consists of choosing as long a pattern length as the time series length will permit so that severe undersampling does not occur. Finally, within the third qualitative region, the network grows linearly with m and fewer newly formed nodes appear beyond the inflexion point. This point signifies that the difference between V (m inflexion ) and V (m inflexion − 1) is maximal. The ensuing decreasing rate of information incorporation into the network representation may then give rise to issues similar to those of over-embedding [7]. Due to this phenomenon and increasingly greater undersampling, the produced networks can no longer provide a meaningful simplification of the actual dynamics in play.
The above is further supported by looking at the link density of the computed networks which drops with increasing m, as depicted in Fig. 12. In fact, the semi-logarithmic scale plot indicates exponential decay. This result relates to the "existence" of forbidden transitions between symbols of the alphabet associated with given deterministic dynamics. Such pruning concerns the product space of ordered pairs of patterns rather than the space of ordinal patterns itself. Note that the absolute value of the decay exponent in link density is very close to the growth rate exponent of the network's size.
An similar situation transpires for the nonoverlapping variant which suffers even more significantly for large m. A radical drop occurs at approximately m = 25. In both variants, for m 25, we observe a decay to 10 −7 , which occurs due to the summation term in Eq. (A1) approaching V − 1 since the network comprises a long string of V = 10 7 nodes connected by single directed links between predecessor-successor pairs of patterns. In fact, the large rise (and drop) in V (m) (ρ E (m)) discerned in Fig. 11 (Fig. 12) is also detected by measures such as PE, sorting entropy (SE; see Ref. [22]) and the entropy of the joint distribution of pairs of patterns in an analogous fashion (results omitted). The observations for w = m networks, in contrast to w = 1, suggest a greater sensitivity displayed by the joint distribution of ordered pairs of patterns-in comparison to the more robust distribution of patterns-on the amount of overlap, as demonstrated in the following section. Figure 13 depicts the two fundamental network observables for all possible overlap scenarios. It is evident that the effects on the network size V are negligible. However, measures which depend on the joint probability distribution of successive pairs of patterns exhibit sensitivity to the amount of overlap. This is highlighted by link density ρ E in the right panel of Fig. 13. It increases with increasing slide lag, i.e., decreasing amount of overlap. This inverse relation implies an increasing diversity of forward-time transitions (and hence links in the network) as successive patterns become decreasingly overlapping and therefore less and less correlated.

Slide lag
Another interesting observation is the saturation of ρ E beyond a certain slide lag. This may be explained by the fact that the number of forward-time transitions does not vary in total once the level of correlation between successive patterns has decayed to zero. This effect can only be observed for higher m values which allow complete decorrelation between consecutive symbol pairs. Since the value is similar irrespective of m once m is sufficiently large, it cannot merely be attributed to undersampling. This artefact has an interesting repercussion; it may be used as an indicator of the correlation time of a chaotic system and act as an ordinal alternative to commonly employed estimates such as the first zero (or decay time) of the autocorrelation function-which is always zero after a single step for discrete-time flows examined here since there is no linear correlation, the first minimum of the self-mutual information and the quarter-period length.
Consequently, variation of w has no impact on the distribution of ordinal patterns, but it does affect the joint distribution of ordered pairs of successive patterns, on which correlation and/or undersampling effects are manifested in a pronounced manner with increasing m. While w = m produces a richer diversity of admissible transitions, this is not necessarily beneficial. We therefore suggest the use of maximal overlap, thereby restricting the set of possible successor patterns. The networks produced are less sensitive to changes in m. Also, whereas the number of admissible patterns eventually observed is the same irrespective of w (see Sec. V B 4), reaching the true number of patterns for a fixed pattern length occurs for a much smaller sample size N in the case of the maximally overlapping variant. The conditioning on predecessorsuccessor pairs helps reduce undesired effects due to the Markovianity of the dynamics not being fully preserved in the projection to the symbol space-as with any dimensionality reduction procedure-unlike the embedding transform where topological conjugacy can be guaranteed.

Time-series length
It is self-evident that the longer a dataset, the more information an observer possesses about a system. In practice, however, measurements constitute small to intermediate-sized datasets in several disciplines. In the framework of ordinal patterns, small samples may lead to undersampling due to the super-exponential growth of the symbol space as m increases, e.g., 10! is approximately 3.6 million. Essentially, there is an interplay between finer resolution and insufficiently sampled regions in the ordinal partition Q. The situation is even worse given that our aim is to estimate dynamical invariants from a single long trajectory rather than an ensemble of trajectories from randomly sampled initial conditions. This depends not only on the ergodicity of the underlying system, but also the capacity of the ordinal probability measure to preserve the ergodic property. The following results demonstrate that ordinal representations can prove faithful in the case of lowdimensional chaotic dynamics.
The relevant literature on the topic of undersampling mainly revolves around forbidden patterns and their outgrowth ratios [56] or PE, mostly in the context of white noise. An analytical treatise [75] illuminated the exact relation between PE of a Gaussian noise scalar time series to m and N, which obeys a χ 2 -distribution of (m − 1)! degrees of freedom. Several results on PE of fBM, fGN, and ARMA processes were obtained by Bandt and Shiha [76], but without any consideration of the dependence on the number of sample points N.
Here we are mainly interested in the effects of N when computing OPNs of increasing m from chaotic or hyperchaotic dynamics. The most commonly employed heuristic for avoiding undersampling is to choose a value for m such that the sample size is significantly larger than the number of symbols, equal to m!. For maximal overlap (w = 1), this FIG. 14. Number of nodes vs. sample size N. Evolution of the network's size as the time series length increases. Network construction using maximal overlap (w = 1; solid lines with markers) and zero overlap (w = m; dashed lines) between successive patterns with m = 8, 10, 12, 14, 16. leads to N m! + m − 1 [56]. In the general case, N w(m! + 1) + m which has also led to the recommendation N 5m! [57,77]. However, our numerical experiments indicate that undersampling may be avoided even at a fraction of these sample sizes in the case of both dissipative as well as expanding low-dimensional chaotic flows. Figure 14 shows the network size of OPNs computed from the (r = 4) logistic time series for various m as a function of N. When N is small, the network size increases linearly in the double-logarithmic scale of Fig. 14. This indicates a power-law relationship between sample size N and the number of distinct admissible ordinal patterns. In the maximal overlap case, the exponent is approximately equal to 1, i.e., there is a linear growth. In the nonoverlapping case, V ∼ N 1 10 , which shows slow growth and confirms the necessity for much longer trajectories (result omitted).
These findings are also evidenced by the saturation to limiting values (different for each m as expected), whereupon the maximal overlap variant attains a constant profile much faster. The surprising robustness to undersampling effects is apparent by considering m = 8 (circular markers) for instance. Although 8! = 40320, even approximately 2000 sample points are sufficient to obtain an accurate estimate of the network size. This is confirmed by looking at the corresponding link density in Fig. 14(b), whereby the saturation is now at lower values since the network becomes less dense relative to its size. For m = 8, a limiting link density is attained at about 3000 samples points. Finally, we remark that the differences resulting from the amount of overlap are perspicuous in these two fundamental structural network characteristics. There is, however, convergence between the two extremes of maximal and no overlap for adequately long time series.

C. Network topology
A comparison between networks computed from trajectories of different dynamics shows that a different topology is obtained-with the exception of when two regimes are topologically equivalent. A different size and link density is observed for the logistic map and time series from a chaotic cubic map, a Gauss map, the bent Baker's map, 062307-16 or the two-dimensional Lozi, Henon, and Ikeda maps. In contrast, the one-dimensional tent map and the dyadic (or bit-shift/doubling) map led to networks exhibiting similar structure. Computations on scalar x-component time series of a subset of these are displayed in Fig. 15. The exponential scaling behavior observed previously is apparent for all the maps considered, although at different rates. This, we conjecture, relates to the complexity characterizing the underlying dynamics. However, the width of the scaling region can be elongated in dynamics where the diversity of symbols is lower (smaller topological entropy), e.g., the chosen chaotic regime in the Lozi map exhibits a particularly long scaling region and markedly fewer patterns for respective m values. Its twodimensional attractor is very thin with a particularly small Lebesgue measure.
We observed that similar conclusions hold for the two different types of PSS that we investigated in the case of the Lorenz attractor. Irrespective of the observable, all three components-and even the time series of the corresponding return times-produced similar network topologies. Evidently, spatial characteristics of the original attractor are present in the networks' structure depiction. The size of the network and link density as a function of m (Fig. 17) resemble the behavior that we observed for the logistic map. This is not so surprising in the case of the z max PSS since it is very similar to the tent map (which we have also tested and obtained full agreement to these results) as the two discrete maps are topologically conjugate. However, the remarkable observation concerns the capacity of both types of PSS, the one-dimensional z max as well as the two-dimensional z = 27, to generate very similar network topologies. Furthermore, any associated observable, for instance the corresponding x and y coordinates of the successive z maxima produce almost the same network. This result is further supported by exact quantitative estimates of the Lyapunov exponent [33] and the topological entropy [34] of the system. Unexpectedly, even series of recurrence times extracted from either type of PSS can also be used to procure similarly useful network topologies which are representative of the dynamics underlying the Lorenz attractor.

Out-strength distribution of directed multigraph W
In Sec. IV C we demonstrated an analytical connection between the node-wise relative out-strength distribution k out W of directed multigraph W and the stationary distribution of S. This connection is illustrated numerically for the logistic map example on the top panel of Fig. 18. First of all, observe the fat-tails characterizing this empirical distribution. A linear fit is shown on the inset logarithmic-scale plot. The slope is rather steep (estimated exponent γ 4.21 which is outside the typical range of 2 < γ < 3), but the structure resembles an empirical power law, i.e., a scale-free distribution. This observation is further supported by the associated plot of the cumulative distribution function, shown in the left-bottom panel.
We do not posit that OPNs computed from chaotic time series possess scale-free properties, as statistical analysis on empirical distributions has refuted many of the claims on reported scale-free networks and seriously questioned others [78]. In addition, simplistic mechanisms such as preferential attachment have absolutely no meaning within our context. Instead, we focus on the connection to the Markovian stationary distribution. It is likely that the scaling near the tails of the k out W distribution is a consequence of the singularities in the natural measure of the logistic map near the end-points of the domain. Since a higher concentration of mass occurs in these two regions of the one-dimensional state space, trajectories pass through this portion of space more frequently. Therefore, the symbolic itinerary s n produced by an accurate embedding onto a higher-dimensional space should, in principle, reflect this property. This, in turn, would produce a higher number of adjacent links to the corresponding nodes of the network, i.e., those nodes which represent the relevant ordinal patterns in s n -and by extension the elements of the partition Q of M IMG .
Furthermore, the bottom-right panel of Fig. 18 reinforces this hypothesis. It displays the number of time series segments x (12) i , which have been mapped to the two largest hubs of the network. As hubs we chose to collect the nodes whose number of outgoing connections k out W is larger than 5 000 (there are two of them) and 6 000 (one of these two). Mapping back to the original state space, via labeling of the first coordinate of the 12-dimensional space, is illuminating. All trajectory points which the OPN algorithm has mapped to the partition element represented by the node with k out W > 6 000 are found very close to the end-points of interval [0,1], i.e., near the singularities of the invariant measure. Points mapped to the 5 000 < k out W < 6 000 node are found in the vicinity of the most dominant unstable periodic orbit (of order 1), i.e., the least unstable, which occurs at the fixed point x = 0.75.

Projection of ordinal partition onto 1D state space
To examine the above hypothesis in a more elaborate manner and understand the mechanism behind the proposed connection between k out W and the invariant density of the underlying dynamics, we present some indicative results on mapping back nodes to the original state space. Specifically, we illustrate the projection of a higher-dimensional ordinal partition back onto the original one-dimensional state space [0,1] of the logistic map for the intuitively simple and accessible cases of pattern length m = 2, 3, and 4.
All nodes of the resulting network have been projected back onto the original state space by inspection of the first coordinate of the two-, three-, and four-dimensional embedding space, respectively. The location of these points is shown on the horizontal axis of the three panels of Fig. 19, while the vertical axis represents the index of the node/pattern that these points correspond to, normalized by network size V so as to constrain the range within the interval [0,1]. These values are arbitrary and have been so chosen purely for visual purposes and facilitation of distinction between the different elements of the projection. For instance, the top-left panel portrays the projection of the m = 2 partition. There are two nodes, say v 1 , v 2 , corresponding to ordinal patterns π 1 , π 2 and partition elements Q 1 , Q 2 (shown in Fig. 4) of two-dimensional embedding space. In this case, V = 2 and index i = 1 or 2. Consequently, the vertical axis which represents i/V can take values from the discrete set { 1 2 , 1}. We have juxtaposed the zeroth, first, second and third iterates of the logistic function as well, respectively, on the plot of each panel.
Concentrating, first, on the m = 2 case, we notice that the "map-back" projection has separated the state space into two uneven intervals of Lebesgue measure equal to 0.75 for pattern π 1  Finally, the same trends are observed for the m = 4 partition projection, portrayed in the bottom panel of Fig. 19. Here there are 12 admissible patterns and another 12 forbidden ones. As previously, boundaries between elements of the projection coincide with either (a) periodic points of order 1, 2, and 3 or (b) eventually periodic points. An important observation is related to the length of each interval which, although unequal to the others, displays a decreasing length in comparison to length of the elements in the m = 2, 3 projections. This implies that the diameter of all elements of the ordinal partition projection onto the original state space decreases with increasing pattern length m. This is of significant value to the accuracy of the stochastic approximation, the Markov process S, to the underlying deterministic dynamics. Refinement of partition producing smaller elements in the Lebesgue measure constitutes a common requirement for traditional transfer operator approaches [30,31]. Note that one difference between the m = 4 case and the former two is that three elements of the projection, in particular those corresponding to the regions Q 4 , Q 6 , and Q 9 in four-dimensional embedding space-vertical coordinate equal to 4 12 0.333, 6 12 = 0.5, and 9 12 = 0.75, respectively, in the bottom panel of Fig. 19-comprise the union of two disjoint intervals. This does not alter results or create any issues, but it does pose a peculiar property related to such partitioning schemes as with ordinal symbolization.
One final remark is that the inverse mapping procedure described above, can be used to identify the location of all periodic points of arbitrary order within a dataset. This also holds true for the location of all eventually periodic points. In contrast, if the underlying dynamics is dictated by a system of higher dimensionality, this procedure is no longer efficient for localising periodic orbits.

Nodal clustering
Very few nodes have nonzero local clustering coefficients (proportion of triangles over triplets passing through). For example, the OPN computed from the logistic map with m = 12 contains 10 525 nodes, while only 19 out of these do not exhibit zero clustering. Figure 20 displays the histogram of all such nodes, divided in two classes. The set of nodes with very low clustering (C < 0.1) appears in red, whereas FIG. 21. Histogram of points that have been mapped to symbols corresponding to those nodes in the network with low clustering coefficient (C < 0.1). Gray dashed lines represent the 6 unstable periodic points of prime period 3. nodes with higher clustering are shown in blue. We noticed that time-series points located in close proximity to the two fixed points of the map have been mapped to the latter class of symbols/nodes (blue).
In contrast, the former class (red) consists of symbols onto which points that lie close to the two 3-periodic orbits have been mapped. In fact, all six points with prime period 3 are encapsulated by this set of nodes. The histogram in Fig. 21 can be used to verify this as the gray dashed lines represent the exact location of all nondegenerate 3-periodic points.
This result is very indicative of the type of information encoded within the cyclic structure of OPNs. We delve deeper in the following section by considering cycles of arbitrary length. In summary, nodes characterized by nonzero clustering represent either fixed points of the underlying dynamics (higher values) or points with prime period 3 (lower values).

k-cycles in unweighted directed network A
Detecting the paths composed of k links from a node to any other within a network comprises a rather simple and efficient procedure. Given knowledge of the adjacency matrix, one need only look at the kth power of the matrix. Its (i, j) entry provides the number of all such walks from node i to node j. If specifically interested in the cyclic paths, i.e., those paths which end on the starting node whereby we have j = i, the so-called k-cycles, then the main diagonal of the kth power is required. Within the ordinal context, detecting k-cycles is rendered by the connectivity matrix, the adjacency matrix of the unweighted directed network A. A k determines all the possible dynamical transitions from region Q i to Q j in embedding space within k time steps. Figure 22 shows the base-2 logarithm of the number of k-cycles for the OPN computed from the logistic trajectory with 3 m 12. It is clear that increasing m shifts the empirical curve closer to the diagonal, i.e., the expected curve since there exist 2 k unstable k-periodic points when r = 4. We utilize this result to detect the location of unstable periodic points in discrete maps via finer sieving obtained by increasing the pattern length.
We illustrate our results by considering the situation of k = 4 for the logistic map, which is amenable to an analytical approach for all r values [71]. As a proof-of-concept for our conjecture, Fig. 23 displays a 64-bin histogram of those time series points which have been mapped to ordinal symbols belonging to 4-cycles in the computed network (m = 6). Gray dashed lines represent the 16 locations shown in Table II. It is evident that the vicinity of unstable periodic points of order k = 4 can be captured by the connectivity pattern of an OPN.
Increasingly finer sieving should, in principle, produce more precise results and shed light to the voluminous image portrayed in Fig. 23. As expected, Fig. 24 demonstrates that longer patterns (m = 12 here) are able to produce networks whose cycles encapsulate unstable periodic points embedded within a chaotic attractor-using the term slightly loosely here since the "attractor" in the logistic (and any expansive) map is composed of the entire state space, but see the following section for an example of a dissipative continuous-time flow.
The histogram of Fig. 24 illustrates three important facets of the described procedure. First, the aforementioned precision which implies that ordinal partitions of finer resolution allow one to detect unstable periodic points from a scalar time series with arbitrary precision. Second, the frequency (height) of each bin is dictated by the order of the periodic point, i.e., the lower the order, the more dominant the unstable periodic point. Hence, more time series points may be found in the local neighborhood thereof. Finally, four points have been missed which indicates that a longer trajectory may be required to approach these periodic points sufficiently close and thereby allow the network to detect their location. Sec. V D illustrates the effectiveness of the proposed technique towards detection of unstable periodic orbits (UPOs) in continuoustime flows.

D. Detecting unstable periodic orbits in 3D chaotic flows
We use the Lorenz system as an example, in particular the two-dimensional z = 27 PSS. Ordinal networks were computed from the first component for various values of m and we focus on 4-cycles as previously. There are 18 such entries which are nonzero in the case of m = 14 which we use as a case study (Fig. 25). Collecting all the time series segments x (14) i mapped to the diagonal entries of A 4 , the plot displays a histogram of the first of the 14 coordinates in embedding space. For comparison purposes (see Fig. 27 below), we have chosen to show frequency, and not its relative counterpart, on the vertical axis-as opposed to the results on the logistic map. Clearly, the symmetry of the Lorenz attractor is captured in this figure. We propose that the peaks represent the local neighborhood of the unstable fixed points around which the lobe dynamics of the attractor are revolving, as well as the lowest-order UPOs. Moreover, the highest peaks correspond to points of the PSS through which the elliptic trajectories near the fixed points transverse. The coordinates of the latter are, respectively, given by x FP,1 = ( √ β(ρ − 1), √ β(ρ − 1), ρ − 1) (8.4853, 8.4853, 27) and −8.4853, 27). The second-highest peaks are due to points in the vicinity of the only UPO of order 2, the most dominant of all the periodic orbits. It is symmetric with respect to the z axis and somewhat resembles the ∞ symbol, shown in the inset figure of the bottom-right panel of Fig. 27. The next-highest peaks correspond to UPOs of order 4 and so forth. Evidence in support of our conjecture are provided via Figs. 26 and 27. Figure 26 displays the right (positive) branch of the z = 27 PSS. Markers of four different colors and shapes have been placed at the location of all time series points that have been mapped to nodes v 1 , v 9 , v 11 , and v 17 from the set of the 18 nodes that form part of 4-cycles in the network. Nodes v 1 , v 18 are also identified as cycles of order 1, hence they correspond to the two fixed points. Nodes v 2 , v 5 , v 15 , v 17 form cycles of order 2. Therefore, we expect that v 1 encodes the neighborhood around fixed points, v 17 the neighborhood around UPOs of order 2 and v 9 , v 11 the neighborhood around UPOs of order 4. Blue circles represent the points mapped to v 1 , whose x and y coordinates may be compared to the first two components of x FP,1 . We conjecture that green triangular markers (mapped to v 17 ) correspond to points near the symmetric UPO of order 2, while red squares (v 9 ) and black diamonds (v 11 ), respectively, represent the state-space points in the immediate neighborhood of one of the two asymmetric UPOs and the single symmetric UPO of order 4. To examine the validity of our hypothesis, we inspect individual histograms of the x component of all points mapped to each of the nodes v 1 , v 9 , v 11 , and v 17 . Furthermore, we have randomly sampled these 4 sets of points drawn from the z = 27 PSS, picked a single example point and numerically integrated the Lorenz equations for 2-3.5 time units.
The results of the procedure described above are presented in the four panels of Fig. 27. In the case of node v 1 , there exists a multitude of time series points that have been mapped to this particular ordinal pattern, in the order of (10 3 ) or less (see the vertical axis of the histogram on the top-left panel of Fig. 27 which displays frequency). Since the sample size of the input time series is N = 1, 133, 155, these points form approximately 0.1% of the total. In stark contrast, points mapped to the other three nodes represent a very small proportion of the time series. Specifically, only 25 points have been mapped to v 9 (top-right panel), 3 to v 11 (bottom-left panel), and only 2 to v 17 (bottom-right panel). Additionally, notice how close the x coordinate (horizontal axis) of the points in each of these classes is to the rest of the members of the class. The inset figures in each panel demonstrate that the randomly sampled test points (one drawn from each class) are indeed extremely close to the aforementioned UPOS of order 4 (v 9 asymmetric and v 11 symmetric) and the single one of order 2 (v 17 ). These results indicate that an OPN computed from a scalar time series is indeed capable of detecting the most-dominant UPOs embedded within the chaotic attractor of a continuous-time flow at sufficiently good precision.

VI. DISCUSSION
Ordinal partition networks comprise mappings from a scalar time series to a complex network. The input data is assumed to be generated by a deterministic dynamical source and our formulation is focused on ergodic systems. Network construction is conducted via a very simple, computationally efficient algorithm drawn from symbolic dynamics. It involves counting ordinal patterns, i.e., permutations of the numbers 1, 2, . . . , m that represent order relations between points within short subsequences of the time series. The premise behind our proposition lies on a Markovian model which provides a stochastic approximation to the underlying dynamics. Our findings indicate that (i) the topology of the computed network encapsulates useful information about the dynamical features of the system and (ii) the mapping can be used to distinguish between different dynamical regimes.
In this work, we set out to test the extent to which information from the original time series is retained by the OPN transformation. Numerical experiments were conducted on an ensemble of systems and dynamical regimes. Our results suggest that the representation is faithful in the sense that connectivity patterns reflect characteristics of the dynamics and different dynamical regimes produce distinctions in the topological structure of the network. In addition, we only examined scalar time series and evidence suggests that any projection of a multidimensional flow can be used to extract a network representative of the underlying system. It can be deduced that any nonlinear monotonic transformation of any state variable will also exhibit the same ordinal structure. We presented an interpretation of OPNs that has a particular meaning within the context of ergodic theory and transfer operators. Through this lens, the network constitutes a natural representation of a discrete-time finite-state Markov chain. This allowed us to relate the topology to either the periodicity or the hierarchy of unstable periodic orbits embedded within an attractor, in the case of periodic and chaotic dynamics, respectively. In addition, our framework sets the stage for using OPNs to estimate characteristic indicators of chaotic behavior [33,34].
Three different flavors of OPNs were introduced and all can provide insight into the governing dynamics. The relative out-strength distribution over all nodes of the directed multigraph W is equal to the stationary Markovian distribution, obtained by the 1 normalized left eigenvector of T . The transition graph T contains information about the transition probabilities between the elements of the ordinal partition in embedding space R m , where m the denotes the length of ordinal patterns. In Ref. [33], we illustrate the capabilities of this flavour in a quantitative manner by estimating the Lyapunov exponent of the chaotic system which has generated the time series.
The connectivity matrix A represents all possible transitions from any element in the ordinal partition of embedding space to any other. We utilized it to localize the most dominant (lowest-order k) UPOs of the underlying chaotic dynamical regime. This was achieved via identification of the time series points which have been mapped to cycles of order k in the network. This technique is effective both for discrete maps and continuous-time flow data by means of Poincaré sections. Any type of section that provides a sufficiently accurate depiction of the underlying dynamics may be used, even something as simple and computationally efficient as recording successive maxima from any component of a multidimensional system. Additionally, in Ref. [34] the authors show how it can be used to estimate the topological entropy of the underlying discrete-time flow. The above highlights the benefits of using this network-based technique over simply collecting statistics from the symbolic itinerary and the complementary insight obtained.
Finally, our aim being to provide a computationally cheap tool to augment traditional time-series analysis, we have created a guide for application of this toolkit to given data. Examination of methodological parameters suggests that the embedding lag (τ ) and the amount of overlap between successive windows of the dataset segmentation (m − w) admit an optimal value. The former should be set to unity to avoid the aliasing effect, i.e., the occurrence of false admissible patterns that are not induced by the underlying dynamics. The latter parameter should be maximal, i.e., w = 1, to capture all possible information from the empirical joint distribution of time-ordered pairs of patterns. This leaves one free parameter, namely the pattern length (m) which also dictates the resolution of the ordinal partition, as well as the dimension of embedding space. Instead of an optimal value, we put forth the argument that there exists a scaling region, similarly to traditional embedding practice, where the resulting OPN captures the underlying dynamics sufficiently well. Within this region in parameter space, an OPN provides a useful stochastic approximation which can extract meaningful dynamical features from a scalar time series. An interplay between the sample size and the chosen value of m dictates the size of the optimal scaling region. Plotting the number of nodes as a function of the sample size N-for given time series using increasingly larger subcomponents thereof-for various values of fixed m can be used to determine this region.

Link density
Denoted by ρ E , it constitutes the second fundamental network characteristic. Traditionally defined by whereby self-loops (links from a node to itself) are not considered. This is equivalent to removing the main diagonal from adjacency matrix A. We have repeated the entirety of our analysis with a variant which includes self-loops. The normalizing constant changes to V 2 in this case. No differences in results were observed, with the exception of absolute density values.

Out-degree
Denoted by k out A (i), it is defined as the total number of outgoing links adjacent to node i according to connectivity matrix A. Also referred to as degree centrality.

Out-strength
Denoted by k out W (i), it comprises the extension of the outdegree metric to weighted networks. It equals the number of outgoing links from each node in the multigraph W , i.e., total number of transitions from a given pattern observed in itinerary s n . In the case of the transition graph, we have k out T (i) = 1 by definition.

Clustering coefficient
The coefficient of nodal clustering is defined as the proportion of directed 3-cycles over all possible directed node triplets passing through a node in the network. In set builder notation it is given by Note that the denominator equals k in A ( j) · k out A ( j), i.e., the number of serially connected 3-node subgraphs passing through j. The clustering coefficient of the network is the first moment of the distribution of C over all nodes, i.e., C clust = 1 V j C.

Closeness centrality
It represents the relative ease/speed of reaching any other node from a specified node in the network on average. It is defined as where d i j denotes the number of links that the shortest path from node i to j is composed of.

Eigenvector centrality
A natural extension of the concept of out-degree to include connections of length greater than unity. Within a social context and in undirected networks, it quantifies the indirect importance of a node via its links to other important nodes-a measure of influence. Within our setting, EC provides a measure of centrality for each symbol in s n -hence, each element of partition Q-by taking into consideration arbitrary-length transitions to other elements of Q. This highlights basins of attraction according to the Markov model. It is given by the right eigenvector of the connectivity matrix A, i.e., EC(i) = j A i j EC( j). (A4)

k-cycles
Paths from a specified node to itself that consist of k links. This is equivalent to self-loops when k = 1 and bidirectional links when k = 2.

Logistic map (1D)
The one-dimensional (1D) logistic map with parameter r ∈ [0, 4] is given by the equation We focus on the r = 4 regime, whereby dynamics produce (i) expanding properties (true for all r > 1), i.e., volumegenerating in the sense of the Lebesgue measure, (ii) chaotic behavior (true for all r > r c 3.57), (iii) strong mixing properties, and (iv) the existence of 2 k unstable k-periodic points that are dense in the unit interval as k → ∞. It constitutes a measure-preserving transformation of the 1D state space [0,1] to itself that is ergodic. The unique natural invariant measure, which is absolutely continuous with respect to the Lebesgue measure, is given by the density function which admits two singularities near x = 0, 1.
In discrete-time flows, fixed points satisfy x n+1 = x n and, in this case, are located at Fix( f ; r) = {0, 1 − 1 r }. The origin exists ∀r ∈ [0, 4] and is stable for r ∈ [0, 1), while the nontrivial fixed point only exists (in a physically meaningful sense) for r > 1 and is stable for r ∈ (1, 2). The set of periodic points occur when x n+k = x n . If they are of prime (or nondegenerate) period k, then we denote them by Per k ( f ; r) = {x| f k (x) = x, f j (x) = x for 0 < j < k}. The set Per 2 ( f ; r) consists of points which satisfy the equation −r 2 x 2 n + r(r + 1)x n − (r + 1) = 0, namely, while Per 4 ( f ; r) is composed of points satisfying a 12th-order equation (omitted). Table I shows a gamut of parameter values which generate inherently different dynamical regimes in the logistic map. Table II shows the exact locations of fixed points and periodic points of prime period 2 and 4 when r = 4.

Cubic map (1D)
We mainly examined the regime of "postmerge crisis" (PMC) chaos when a = 4, whereby the equation generates behavior which admits the existence of an absolutely continuous measure (with respect to Lebesgue).

Tent map (1D)
We examined iterations of the equation which produces chaos for μ = 2 − 10 −8 and approximates behavior generated by f 2 . Integer parameter values lead rapidly to convergence to the origin x = 0.

Henon map (2D)
We examined the chaotic map given by  96, b) for b > 0 very small. The latter regimes have been shown to be ergodic and to generate a nonhyperbolic attractor which admits a unique SRB measure.
APPENDIX C: CONTINUOUS-TIME FLOWS

Lorenz system
The three-dimensional (3D) continuous-time flow generated by simulation of the Lorenz equationṡ z = xy − βz, is investigated using the original parameter values (σ, ρ, β ) = (10, 28, 8 3 ) which produce a "butterfly"-shaped fractal chaotic attractor. We compute two different types of a PSS. A 2D map can be obtained via recording the successive crossings of the Lorenz trajectory in theż > 0 direction with the z = 27 plane. Three time series are stored, the x and y component of the location of the crossings, as well as the recurrence times. A one-dimensional Poincaré map can be obtained by the famous technique set forth by Lorenz [72] using the maxima of the trajectory in one direction, e.g., the sequence of successive maxima in the z component of the continuous flow. We store all three components associated with the z max PSS, as well as corresponding return times.

Rössler system
We examined the system of equations given bẏ x = −(y + z), mainly for (a, b, c) = (0.398, 2, 4). This choice generates "broadband" chaos. Different regimes for (b, c) = (2, 4) can be obtained by setting the value of a according to Table III.

Chua circuit
Another well-studied example is the three-dimensional system of equations used to describe the dynamics of a simple electronic circuit made from standard components only (resistors, capacitors, and inductors). It exhibits classic chaotic behavior for the set of parameters (α, β, m 0 , m 1 , E ) = (9, 14 + 2 7 , − 8 7 , − 5 7 , 1). The resulting chaotic attractor is the so-called 'double-scroll' due to its shape. The system is given bẏ where the function φ-which describes the electrical response of the nonlinear resistor-is given by