scikit-fda: Interactive Visualization and Analysis Tools for Functional Data






UNIVERSIDAD AUTÓNOMA DE MADRID
ESCUELA POLITÉCNICA SUPERIOR

Degree in Computer Science

DEGREE WORK

scikit-fda: Interactive Visualization and Analysis Tools for Functional Data

Author: Álvaro Sánchez Romero
Advisor: Alberto Suárez González

May 2021

All rights reserved.

Álvaro Sánchez Romero
C/ Francisco Tomás y Valiente, nº 11
28049 Madrid, Spain

PRINTED IN SPAIN

We need to teach that doubt is not to be feared, but welcomed and debated. There is no problem in saying: "I don't know".

Richard Feynman

Preface

This Bachelor's thesis was created with the purpose of extending one of the first functional data libraries in Python, scikit-fda. This library is integrated into the SciKit ecosystem, which includes Python packages for mathematics, science and engineering. Specifically, interactive tools for the visualization and analysis of functional data have been designed and implemented. Functional data correspond to curves, surfaces, etc. whose value depends on a continuous parameter. Their analysis starts from a sample composed of a set of functions, each of which can be considered an independent realization of a stochastic process. Because of these characteristics, it is necessary to develop statistical analysis and visualization methods that are more complex than those used in multivariate statistics.

During its development, every effort has been made to make this part of the library as easy to use as possible, through the standardization, homogenization and restructuring of the visualization functionality. Additionally, fundamental visualization and analysis tools for characterizing functional data have been completed and extended. Finally, the plots have been given interactivity, which makes dynamic exploration of the data possible. Thanks to this functionality, the user has advanced tools for extracting as much information as possible from the data.

As the author of this work, I hope you find this document entertaining and enjoyable, just as I did while developing it.

Acknowledgements

I want to thank the whole working group that makes up this great project, scikit-fda. Carlos Ramos, for his great help with both the library and the world of functional data. My advisor, Alberto Suárez, for his constant work and his willingness to guide me in selecting the project. My colleagues, Elena Petrunina and Pedro Martín, who have accompanied me virtually throughout the project. I also want to thank the teaching staff of the university, who have helped me get to where I am, and especially Eloy Anguiano, for his template for writing this document. Finally, I want to thank my family for supporting me whenever I have needed it.

Summary

In this work, interactive tools for the visualization and analysis of functional data have been developed in the scikit-fda library [1]. The scikit-fda library for functional data analysis is integrated in SciPy [2], an open-source ecosystem for mathematics, science and engineering developed in Python. Functional Data Analysis is the branch of statistics that studies random variables that depend on a continuous parameter; that is, random functions such as curves, surfaces, etc. This field could, for example, study the growth in height of a person during their youth (in this case the continuum would be time). To obtain graphical information from these data, plots have been developed that reveal details about the curves, outliers and the parameterization of functions, among others.

To begin with, a comparative analysis of the available visualization and data analysis libraries is carried out. In this study, special attention is paid to the interactive tools they provide to the user. To establish the context in which the project has been developed, a brief description of the structure and functionality of scikit-fda is given. Next, the main visualization tools developed are discussed: a detector of curves with atypical shape, a method for parameterizing functions and a plot that allows a dataset to be compared with two distributions. Additionally, a module has been implemented that allows interaction with all the plots through widgets or with the cursor itself. For this, it was necessary to investigate the best and most efficient solutions that allow the tool to work in all graphical user interfaces (GUIs).

The document also explains the software development stages followed to design and implement the software. It covers the documentation, the types of tests and the working mechanisms used. GitHub is the tool selected for sharing progress and is where the code is available [3].

My objectives for this Bachelor's thesis are to create a set of visualization tools that help users, and to create an easier, more standard and homogeneous interface for both the existing methods and the new ones.

Key words

Functional Data Analysis, visualization, interactivity, Python, Matplotlib, scikit-fda (Python package for FDA), depth measures, open-source software

Abstract

In this work, interactive tools have been developed for the visualization and analysis of functional data in the scikit-fda library [1]. The scikit-fda library for the analysis of functional data is integrated in SciPy [2], an open-source ecosystem for mathematics, science and engineering developed in Python. Functional Data Analysis (FDA) is the branch of statistics that studies random variables that depend on a continuous parameter; in other words, random functions such as curves or surfaces. FDA could, for example, study the growth in height of a person during their youth (in this case the continuum being time). To obtain graphical information from this data, new graphics have been developed that reveal details of the curves, atypical values, the parameterization of functions, and others.

To start, a comparative analysis of the available visualization and data analysis libraries is carried out. In this study, attention is paid to the interactive tools that are given to the user. To establish the context in which the project has been developed, a brief description of the structure and functionality of scikit-fda is given. After that, the main visualization tools developed are discussed: a shape outlier detector, a method to parameterize functions and a graphic that allows a sample of data to be compared with two distributions. In addition, I have worked on a module that allows the user to interact with all the different graphics through widgets or with the mouse. For this, it was necessary to investigate the best and most efficient solutions that allow the plots to work in every graphical user interface (GUI).

Furthermore, this document describes the software development steps that were followed to design and implement the project. It comments on the documentation, the types of tests used and the work mechanisms followed. GitHub is the tool used to share our progress and is where the code is currently available to everybody [3].

My objectives with this degree work are to create a set of visualization methods that help users and to create an easier, more standard and homogeneous interface for both the already existing methods and the new ones.

Keywords

Functional Data Analysis, visualization, interactivity, Python, Matplotlib, scikit-fda (Python package for FDA), depth measures, open-source software

Table of Contents

1 Introduction
1.1 Goals and scope
1.2 Document structure

2 State of the art
2.1 Visualization software
2.1.1 Plotly
2.1.2 Ggplot2
2.1.3 Matplotlib
2.1.4 Seaborn
2.2 Functional Data Analysis
2.3 scikit-fda
2.4 Visualization and tools for functional data analysis
2.4.1 Depth measures
2.4.1.1 Integrated Depth
2.4.1.2 Band Depth
2.4.1.3 Modified Band Depth
2.4.1.4 Modified Epigraph Index
2.4.2 DD Plot
2.4.3 Parametric Plot
2.4.4 Outliergram
2.4.4.1 Relationship between MBD and MEI
2.4.4.2 Shape outlier detection

3 Software development
3.1 Analysis
3.2 Design
3.2.1 BasePlot Class
3.2.2 Outliergram Class
3.2.3 DDPlot Class
3.2.4 ParametricPlot Class
3.2.5 GraphPlot Class
3.2.6 Rest of plotting classes
3.2.7 MultipleDisplay class
3.3 Implementation
3.4 Testing
3.5 Integration
3.6 Licenses

4 Results
4.1 Outliergram
4.2 DD Plot
4.3 Parametric Plot
4.4 GraphPlot with gradient of colors
4.5 Multiple Display

5 Conclusions and future work

Bibliography

Appendices
A Tools used
A.1 Wemake-python-styleguide
A.2 Testing
A.3 Documentation
A.4 GitHub
B Usability test
C Gantt Chart
D Notebooks

Lists

List of figures

2.1 Grammar of graphics
2.2 Stem plot
2.3 Modules scikit-fda
2.4 Parametric plot example
2.5 Curves with different shape
3.1 Use case diagram
3.2 Class diagram of the visualization module
3.3 Colormap example
4.1 Canadian Weather dataset (temperatures)
4.2 Outliergram of the Canadian Weather dataset
4.3 Representation of Atlantic, Continental and Pacific temperature functions
4.4 DDPlot first experiment
4.5 DDPlot second experiment
4.6 DDPlot third experiment
4.7 Parameterization of word fda
4.8 Parameterization of derivatives (2)
4.9 Parameterization of gait cycle
4.10 Parameterization of gait cycle
4.11 Visualization of temperature curves with gradient of colors thanks to depths
4.12 Visualization of temperature curves with gradient of colors thanks to MEI
4.13 Basic Multiple Display
4.14 Multiple Display clicked
4.15 Multiple Display with widgets
4.16 Hovering example
C.1 Gantt Chart 1
C.2 Gantt Chart 2
C.3 Gantt Chart 3

1 Introduction

Over the years, new fields in statistics have emerged. One of them is Functional Data Analysis (FDA). The object of study of this area is data consisting of curves, surfaces or any quantity that changes over a continuum. An example of this type of data is a collection of functions that depend on variables such as time or space. Although its beginnings date back to the 1940s, core advances have been made in recent times thanks to Ramsay and Silverman [4] and Ferraty and Vieu [5]. These are some of the fundamental references of this project, as they have helped me understand the most important concepts of FDA.

One of the interesting aspects of this branch of statistics is the number of areas it can be applied to, such as financial time series, biomedical signals, climate patterns, and so on. As interest grows, more frameworks related to this topic appear. Most of them are written in the R programming language, although some can be found in Matlab. In R we can find multiple packages such as fda.usc [6], fda [7] (which was also developed by Ramsay), fdasrvf [8] and others. Some of these libraries have a more general objective, while others are more specialized in certain functionality such as regression, classification or clustering. Almost all the R packages related to this subject are stored in the CRAN repository, The Comprehensive R Archive Network [9].

A functional dataset can be very simple to represent graphically: for example, a set of curves in a 2-dimensional space. The representation of surfaces or higher-dimensional quantities poses more difficulties. For this reason, it is important to have a good visualization module that allows users to carry out a visual exploration that increases their understanding of the data. The objective is to create a simple and homogeneous interface in which users can easily visualize their plots.

In this project I collaborate on the development of the visualization and interactivity module of an FDA package in Python, scikit-fda [1]. This package is an open-source software project that started in 2018 and to which many people have contributed [10]. Since then, it has been continued by other students and researchers of the Machine Learning Group at the Department of Computer Science of the Universidad Autónoma de Madrid (UAM), with the objective of creating one of the first Python libraries related to FDA. Some of the reasons for creating the package are the increasing and widespread use of Python in statistics and machine learning during the last years and the growth of interest in FDA. This software is expected to be useful for many Python users worldwide, who will have multiple functional data tools available for their scientific projects.

1.1 Goals and scope

This project is focused on the development of visualization and analysis tools, expanding the work done by other students in this area, such as Amanda Hernando [11], who developed new depth measures and clustering methods. The project goal is also to carry out a comprehensive redesign of the visualization tools. These have been grouped in a module and organized in a hierarchical structure. The interface has been homogenized, standardized and completed, including interactive functionality to compare graphics and gain insights from them. To do this, I had to study the most important visualization packages to see which one best fulfilled the requirements of our project.

The new graphs developed are the Outliergram, which is focused on detecting shape outliers; the DD-Plot, used to compare a dataset with different distributions; and the Parametric Plot, which allows us to show two functions as coordinates. Moreover, to give more information when representing samples, I added new features to the existing plotting method. These new functions give the library a wider range of ways to view data and plots.

One problem that the library was facing was the lack of homogeneity of the visualization methods. It could be difficult for users to understand how to use all of them, so I decided to give them a common interface that makes them all work in the same way. With the same purpose, the best ways to standardize the code were studied, imitating other scientific libraries, so that users are more comfortable with our software. With these changes the project succeeds in the goal of making the library easier for the user, something that is really important given the complexity of the library and of Functional Data Analysis, as the user is not necessarily an expert in it.

For a better understanding of the functions, an interactive module was added, allowing the user to combine graphs and compare different representations of the same dataset in different plots. This provides users with a much more comfortable and flexible experience.

One of the priorities of the project was also to develop high-quality software, understandable for any user familiar with libraries in the Python scientific ecosystem. For this, besides what has already been mentioned, I pursued the objective of commenting and typing our code intensively. Online notebooks were created as examples (available at the website of the project [12]) of the new functionality added to the project.

Finally, I also took into account how well the code was integrated with different independent platforms such as Jupyter, and how the new interactivity functioned. In addition, the code is tested on Windows, Mac and Ubuntu, and is compatible with Python versions 3.7 and 3.8. Our main goal was to make it work for every Python user, independently of the backend they use.


1.2 Document structure

The document is organized in the following chapters:

Chapter 2 explores the state of the art of the area, providing a review of advanced tools for visual data analysis and studying the latest technologies and algorithms implemented. First, the most advanced libraries for creating both static and interactive visualizations are analyzed, reviewing the functionality they provide, their limitations and their strengths. Afterwards, Functional Data Analysis and the package containing the new functionality developed in the project (scikit-fda) are explored, explaining its main modules and purposes. Finally, I describe the main visualization methods developed, their formulas and what information these graphics give us.

Chapter 3 describes the software development process followed to develop the project. It describes all the steps taken (analysis, design, implementation and testing) and the main decisions made in each one.

Chapter 4 shows illustrations of all the functionality implemented during the project, from the visualization methods to the interactive graphics.

Chapter 5 contains a summary of the contributions made in this project and some conclusions. Different possibilities for future developments are explored as well. Finally, I provide my personal assessment of the project, including a review of the acquired skills that I have applied and the new ones that I have learned.


2 State of the art

This chapter explores some advanced static and interactive tools for visual data analysis, as well as the visualization functions developed in the project. First, a review of software packages for static and interactive visualization is made. Their functionality, advantages and disadvantages are discussed. The study focuses on those aspects that are relevant to the current project. Special attention is devoted to Matplotlib, which is one of the most common visualization packages in Python and is used extensively in our project. The most salient aspects of its Application Programming Interface (API) are highlighted, as this helps to understand some design decisions explained in the following chapter. Then, the scikit-fda library, its different modules and the functionality provided by each module are explained. Finally, the visualization tools developed in the project are explored, explaining the basic functional data notions that motivate them, how they work and what they can do. These tools include extensions of others already implemented, such as the visualization of curves and surfaces, and new ones such as the Outliergram, the DDPlot, etc.

2.1 Visualization software

In this section, the most advanced visualization packages currently available are explored, explaining their characteristics, how well they fit into our project and which one we will be using (Matplotlib).

2.1.1 Plotly

Plotly [13] is one of the most advanced open-source visualization libraries developed in Python. It is also compatible with multiple other programming languages such as R or MATLAB. A basic version of this library is distributed under the MIT license (X11 license), which gives permission for both private and commercial use and modification. Furthermore, it is compatible with other licenses. Additional functionality intended for large companies, such as Chart Studio Enterprise or Dash Enterprise, is provided at a cost.

This toolkit is web-based, can be accessed from multiple GUIs such as Jupyter, and provides a wide variety of graphs. Plotly figures can be exported to a static file such as SVG, PNG or PDF, or to an HTML format that can be displayed in websites. This HTML format incorporates interactive visualization tools, such as hovering, zooming, or showing and hiding elements through the legend. It is also possible to add widgets to interact with the figures.

The main disadvantage of Plotly is that it is not very common in other scientific libraries, so it has fewer dependants. This decreases the prospects of maintenance, because there is no incentive to keep the library up to date. Furthermore, its use also hinders maintaining a structure similar to that of other SciKit packages, which commonly use Matplotlib.

2.1.2 Ggplot2

Ggplot2 [14] is a library written in R that forms part of tidyverse, which is a collection of R packages created for data science. It is very popular due to its use of the concept of the Grammar of Graphics [15]. This concept divides the components of a graphic into layers, which simplifies the readability of the code, since a graph is created by combining them.

The Grammar of Graphics is a technique created by Leland Wilkinson with the aim of establishing a standard way of creating any type of graphic from any context, allowing us to separate the components needed to create a graphic. As can be seen in Figure 2.1, its components form a pyramidal structure according to their importance. The three fundamental pillars used to describe a graphic are the data to be plotted, the aesthetics defining which data is to be displayed (for example, on the axes) and the geometric object used to plot our data (lines, points, bars, etc.). Without these three components it is not possible to create a graphic, but the ones above them are optional. The optional components are facets (used to create subplots), statistical transformations (percentiles, median, etc.), the coordinate system and the theme used. The theme describes features not directly related to the data, such as legends or colors.

This layered framework for plot creation facilitates the specification and analysis of graphs. There is an implementation in Python that makes use of this framework (plotnine). In principle, it could have been used for this project. However, the usage of the library is rather limited, which raises the possibility of its support and maintenance being discontinued in the future. Besides, it is also important that other scientific libraries use it, in order to have the most standard code and make things easier for the user.

One disadvantage of ggplot2 is that it does not have its own interactive module, and it is commonly combined with Plotly (Section 2.1.1) to create interactive graphics. Other tools in R, such as ggiraph, have appeared that also provide this functionality to ggplot2, giving an alternative to Plotly.

Figure 2.1: Main layers of the Grammar of Graphics

2.1.3 Matplotlib

Matplotlib [16] is one of the most widely used open-source packages for Python and its purpose is data visualization. It is known for its extensive functionality and its compatibility with other libraries such as NumPy [17]. It allows the user to plot any type of data: curves (plot), points (scatter), surfaces (plot_surface), heatmaps, etc. It can also create 3D Axes, which are naturally very useful in this project, since they allow us to plot clearly not only curves but also surfaces or stem plots (useful for plotting a discrete sequence of data; see the example in Figure 2.2). Another advantage of Matplotlib is that other packages in the Python scientific ecosystem, such as scikit-learn [18], are based on this library. Its wide use means that the library is likely to be maintained and updated in the short to medium term.

Figure 2.2: Example of a stem plot created with Matplotlib.

Other libraries may have more advanced functionality, but a priority of the project is the stability of the libraries used, and a popular one like Matplotlib provides this. Besides, since this project pursues the standardization of our code, it is a great option to use conventions and an API similar to those of other packages, and to facilitate the use of scikit-fda's visualization tools by programmers who are already familiar with Matplotlib. Another interesting aspect is that, thanks to pyplot, it gives the user a plotting framework similar to the one offered in MATLAB [19]. For these reasons, Matplotlib is the library used in our project.

Besides its many plotting functions, it offers almost total control over style and visualization settings, which can be useful for our project. Matplotlib also has extra toolkits such as mplot3d or axisartist, and very useful third-party packages such as Seaborn (Section 2.1.4).

Matplotlib uses as its canvas an object of type Figure, which is a collection of different Axes or subplots in which the data is represented. The main advanced functionalities used to develop the project, and which allow us to interact with graphs, are:

Matplotlib events: they are used to connect any element of the represented figure to a callback function. These events are triggered by a selected action, such as the pointer hovering over an axis, selecting a point in a scatter plot, moving the mouse and others. For a specific action, the corresponding functionality is implemented in the callback function associated with it.

Artist class: it is an abstract class for all the objects that are rendered (Axis, Figure, plots, scatterings, etc.). This is fundamental for the interactivity, as the callback functions can use its properties to edit the transparency of any sample (set_alpha), modify its color, plot new functions or add annotations. Depending on the plotting function used, there are different types of artists, such as Line2D objects when plotting, PathCollection when scattering, Poly3DCollection when plotting a surface, etc. These different artists have different extra functionality. For example, the elements of a PathCollection can be selected, which is used for the interactivity module. Other types of Artist that also exist are Patch objects, which are commonly shapes, and Annotation objects (labels with text).

Matplotlib widgets: they allow us to interact with graphics by triggering a Matplotlib event when a widget is used. They are represented inside an Axes object and have the advantage that they work in every GUI backend. There are multiple types of widgets, such as CheckButtons, Slider and TextBox. It is possible to define callbacks for their activation in order to modify the Figure.
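The three mechanisms listed above (events, artists and widgets) can be combined in a short sketch. It is not the scikit-fda implementation; it only illustrates the Matplotlib calls involved, and the toy curves, the `DIMMED` constant and the callback names are hypothetical. The Agg backend is selected so that the code also runs without a display; with a GUI backend the same callbacks would be driven by the mouse.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.widgets import Slider

t = np.linspace(0, 1, 50)
fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.2)  # leave room for the slider below the plot

# Artists: ax.plot returns Line2D objects; picker=True makes them selectable.
lines = [
    ax.plot(t, np.sin(2 * np.pi * t) + 0.3 * k, picker=True, pickradius=5)[0]
    for k in range(4)
]

DIMMED = 0.2  # hypothetical transparency for the curves that are not selected


def on_pick(event):
    """Callback: keep the picked curve opaque and fade the others."""
    for line in lines:
        line.set_alpha(1.0 if line is event.artist else DIMMED)
    fig.canvas.draw_idle()


# Event: connect the "pick_event" of the figure canvas to the callback.
pick_id = fig.canvas.mpl_connect("pick_event", on_pick)

# Widget: a Slider lives inside its own Axes and calls back when it changes.
slider_ax = fig.add_axes([0.2, 0.05, 0.6, 0.04])
slider = Slider(slider_ax, "alpha", 0.0, 1.0, valinit=1.0)


def on_slide(value):
    """Callback: apply the slider value as the transparency of every curve."""
    for line in lines:
        line.set_alpha(value)
    fig.canvas.draw_idle()


slider_id = slider.on_changed(on_slide)
```

Because the widget is drawn by Matplotlib itself inside an Axes, the slider does not depend on a particular GUI toolkit, which matches the backend independence noted above.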

2.1.4 Seaborn

Seaborn [20] is a Python visualization package based on Matplotlib. Its clearest advantage is that it upgrades the previous graphics, obtaining better plots with the same code. Another feature [...]