Categories
Linux Python

Massive Nextcloud log file quickly analysed using Python

I ran into a problem with quite a buggy Nextcloud instance on a host with limited quota. The Nextcloud log file would balloon at a crazy rate. So at one point, I snatched a 700 MB sample (yeah, that took maybe an hour or so) and wondered: what’s wrong?

So, first things first: Nextcloud’s log files are JSON files. Which makes them excruciatingly difficult to read. Okay, better than binary, but still, not an eye pleaser. They wouldn’t be easy to grep either. So, Python to the rescue as it has the json module*.

First, using head I looked at the first 10 lines only. Why? Because I had no idea of the performance of this little script of mine and I wanted to check it out first.

head -n 10 nextcloud.log > nextcloud.log.10

Because these logs are scattered with user and directory names and specifics of that particular Nextcloud instance (it’ll be NC from here on), I won’t share any of them here. Sorry. But if you have NC yourself, just get it from the /data/ directory of your NC instance.

I found each line to contain one JSON object (enclosed in curly brackets). So, let’s read this line-by-line and feed it into Python’s JSON parser:

import json

with open("nextcloud.log.10", "r") as fh:
    for line in fh:
        data = json.loads(line)

At this point, you can already get an idea of how long it takes to process each line. If you’re using Jupyter Notebook, you can place the with statement into its own cell and simply use the %%timeit cell magic for a good first impression. On my machine it says

592 µs ± 7.65 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

which is okay: roughly 60 µs per line.

Next, I wanted to inspect a few lines and make reading easier: pretty print, or pprint as its module is called, to the rescue!

from pprint import pprint

pprint(data)

This pretty prints the last line. If you want to access all 10 lines, create for instance an empty list data_lines first and do data_lines.append(data) inside the for loop.

{'reqId': '<redacted>',
 'level': 2,
 'time': '2025-02-06<redacted>',
 'remoteAddr': '<redacted>',
 'user': '<redacted>',
 'app': 'no app in context',
 'method': 'GET',
 'url': '/<redacted>/apps/user_status/api/<redacted>?format=json',
 'message': 'Temporary directory /www/htdocs/<redacted>/tmp/ is not present or writable',
 'userAgent': 'Mozilla/5.0 (Linux) <redacted> (Nextcloud, <redacted>)',
 'version': '<redacted>',
 'data': []}

Okay, there is a message which might be interesting, but I found another one:

{'reqId': '',
'level': 0,
'time': '2025-02-06T',
'remoteAddr': '',
'user': '',
'app': 'no app in context',
'method': 'PROPFIND',
'url': '//',
'message': 'Calling without parameters is deprecated and will throw soon.',
'userAgent': 'Mozilla/5.0 (Linux) (Nextcloud, 4)',
'version': '',
'exception': {'Exception': 'Exception',
   'Message': 'No parameters in call to ',
    …

Now, this is much more interesting: It contains a key exception with a message and a long traceback below.

I simply want to know:

  • How many of these exceptions are there?
  • How many unique messages are there?

In other words: Is this a clusterfuck, or can I get this thing silent by fixing a handful of things?

So, the idea is simple:

  1. Read each line.
  2. Check if the line contains an exception key.
  3. In that case, count it and…
  4. … append the corresponding message to a list.
  5. Finally, convert that list into a set.

And here is how this looks in Python:

import json
from pprint import pprint

lines = 0
exceptions = 0
ex_messages = []

with open("nextcloud.log", "r") as fh:
    for line in fh:
        lines += 1
        data = json.loads(line)
        
        if "exception" in data:
            exceptions += 1
            msg = data["exception"]["Message"]
            ex_messages.append(msg)

print(f"{lines:d} read, {exceptions:d} exceptions.")

s_ex_msg = set(ex_messages)
print(f"{len(s_ex_msg):d} unique message types.")

pprint(s_ex_msg)

I had

37460 read, 32537 exceptions.
22 unique message types.

That’s a lot of exceptions but a surprisingly small number of unique messages, i.e. possible individual causes.

In my case, it mainly showed me what I knew beforehand: The database was a total mess.

But see what you find.

Exercise: See how you need to modify the script to count how many out of the 32537 exceptions correspond to each of the 22 unique messages. And toot about it.
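If you want a hint for the exercise: one possible approach (certainly not the only one) is to swap the list for a collections.Counter. The sketch below is a minimal version of that idea. The function name count_exception_messages and the three sample lines are made up for illustration, they are not from my log.

```python
import json
from collections import Counter


def count_exception_messages(lines):
    """Count how often each exception message occurs in JSON log lines."""
    counts = Counter()
    for line in lines:
        data = json.loads(line)
        if "exception" in data:
            counts[data["exception"]["Message"]] += 1
    return counts


# Made-up sample lines (hypothetical, not taken from a real NC log)
sample_lines = [
    '{"level": 2, "message": "no exception in this line"}',
    '{"level": 0, "exception": {"Message": "first message"}}',
    '{"level": 0, "exception": {"Message": "first message"}}',
    '{"level": 0, "exception": {"Message": "second message"}}',
]

counts = count_exception_messages(sample_lines)

# most_common() sorts by count, highest first
for msg, n in counts.most_common():
    print(f"{n:6d}  {msg}")
```

For the real log, pass the open file handle instead of sample_lines; iterating over it yields one line at a time, just like in the script above.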

*) I wonder if people will come and propose to use simplejson, as I’ve read in the wild, because “it’s faster!!!”. Use %%timeit to find out. Anything else is Mumpitz (forum voodoo).
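If you’re not in a Jupyter Notebook, the timeit module from the standard library gives a comparable number. This is only a rough sketch: sample_line is a made-up stand-in for a real log line, and the repeat counts are arbitrary. To compare another parser, you would swap json.loads for that parser’s loads function (assuming you have it installed).

```python
import json
import timeit

# Made-up stand-in for one log line (hypothetical content)
sample_line = json.dumps(
    {"reqId": "abc", "level": 2, "message": "example message", "data": []}
)


def parse_with_json():
    return json.loads(sample_line)


number = 10_000
# repeat() returns the total time of `number` calls, once per repetition
runs = timeit.repeat(parse_with_json, number=number, repeat=5)
best_us = min(runs) / number * 1e6

print(f"json.loads: {best_us:.2f} us per line (best of 5 runs)")
```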

Categories
Embedded Engineering Linux Python

Red Pitaya using only pyVISA

The Red Pitaya boards offer an SCPI server over a TCP/IP socket connection. The makers describe how to use it. But instead of using plain pyVISA, they provide their own SCPI class.

That’s fine, because that class also provides handy functions to set the various in-built applications (signal generator and the likes).

But it is unnecessarily complicated for a blinky example. And in my case, where I only needed some scriptable DIOs, it was quite cumbersome.

So, here is the blinky re-written in plain pyVISA:

import pyvisa as visa
from time import sleep

rm = visa.ResourceManager()
rp = rm.open_resource("TCPIP::169.254.XXX.XXX::5000::SOCKET",
                 read_termination="\r\n",
                 write_termination="\r\n"
                 )

print(rp.query("*IDN?"))

while True:
    rp.write("DIG:PIN LED0,1")
    sleep(.5)
    rp.write("DIG:PIN LED0,0")
    sleep(.5)

The magic lies in the read and write terminations. They have to be set to '\r\n' (in that order), or else the communication simply won’t work and will time out.

Make sure you install a reasonably recent pyVISA and pyVISA-py (from pip) or libvisa (from your distro’s repository) before you start. For me (Ubuntu) this works as follows:

pip install -U pyvisa pyvisa-py
sudo apt install libvisa

This integrates nicely with existing instrument command structures and allows for quick testing.

Categories
Uncategorized

Got an R&S FSH4 and want to extract screenshots?

Situation: I’ve taken about 50ish measurements on an FSH4 spectrum analyser saving a dataset of each using the save data function of the spectrum analyser.

Turns out: The R&S InstrumentView application does not install on Wine. So, how do I get the screenshots embedded in the saved datasets? Right: Search for the PNG magic number and a chunk end.

This is really sketchy but worked on all measurement files. If you’re in the same situation, just check out my repository.
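To give an idea of the principle, here is a minimal sketch. It is not the script from the repository, just the same idea written down from scratch: look for the 8-byte PNG signature, then for the next IEND chunk type, and cut after the 4-byte CRC that follows it. The function name extract_pngs and the dummy data are made up for illustration.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG file signature
IEND = b"IEND"                     # type field of the final PNG chunk


def extract_pngs(blob):
    """Return all byte ranges in blob that look like embedded PNG files."""
    images = []
    start = blob.find(PNG_MAGIC)
    while start != -1:
        iend = blob.find(IEND, start)
        if iend == -1:
            break  # signature without a chunk end: give up
        end = iend + len(IEND) + 4  # chunk type plus 4-byte CRC
        images.append(blob[start:end])
        start = blob.find(PNG_MAGIC, end)
    return images


# Dummy data: a PNG-shaped byte string (not a valid image) between junk bytes
fake_png = (PNG_MAGIC
            + b"\x00\x00\x00\x0dIHDR" + bytes(13) + b"\x00\x00\x00\x00"
            + b"\x00\x00\x00\x00IEND\xaeB\x60\x82")
blob = b"header junk" + fake_png + b"more junk" + fake_png + b"tail"

found = extract_pngs(blob)
print(f"Found {len(found)} PNG candidates.")
```

On a real file you would read the dataset with open(path, "rb") and write each candidate to its own .png file. Note the sketchy part: the four bytes IEND could in principle also show up inside the compressed image data, in which case the cut would be too early.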

https://gitlab.com/cweickhmann/extract-png-from-rs-set

Note: No affiliation with R&S whatsoever. Please don’t pester them if the script is not working.

Categories
Finite Elements

Breaking Changes 2: Pygmsh, Fenics and Meshio>=4.0.0

Package versions this was tested with (2020-12-22):
gmsh 4.7.1
fenics 2019.1.0:latest
meshio 4.3.7
pygmsh 7.1.5

So, I thought, let’s do the article on Meshio>=4.0.0 and Fenics and show how interchangeable gmsh itself, the gmsh Python API, and pygmsh are. Well, maybe I just don’t get it, but I did not manage to show that.

Using pygmsh with physical labels and Fenics is a bit unclear to me. I got it to work eventually, so here are the code blocks.

Generate the mesh using pygmsh

import pygmsh
# OpenCascade in pygmsh seems not to support extraction of lines from a rectangle (... to use with physical labels).
# So, let's use the geo kernel:
with pygmsh.geo.Geometry() as geom:
    r1 = geom.add_rectangle(0., 5e-3, 0., 2.5e-3, z=0.)
    geom.add_physical(r1.lines, label="1")
    geom.add_physical(r1.surface, label="2")

    # generate the mesh inside the with block, while the geometry is still alive
    mesh = geom.generate_mesh(dim=2)

# We'll use gmsh format version 2.2 here, as there's a problem
# with writing nodes in the format version 4.1 here, that I cannot figure out
mesh.write("test.msh", file_format="gmsh22")

So, here’s the first oddity I could not get my head around: There seems to be no easy way to access the boundaries of the rectangle generated with the OpenCASCADE kernel. In gmsh’s API they were there as the first four one-dimensional items (although without the tutorial file there would have been no way I could have guessed that).

The second problem was writing the generated mesh to gmsh format version 4.1, which resulted in an error message I could not quite trace back:

>>> mesh.write("test.msh", file_format="gmsh") # That's gmsh41

---------------------------------------------------------------------------
WriteError                                Traceback (most recent call last)
  in 
      13 # with writing nodes in the format version 4.1 here, that I cannot
      14 # figure out
 ---> 15 mesh.write("test.msh", file_format="gmsh")
 /usr/local/lib/python3.6/dist-packages/meshio/_mesh.py in write(self, path_or_buf, file_format, **kwargs)
     158         from ._helpers import write
     159 
 --> 160         write(path_or_buf, self, file_format, **kwargs)
     161 
     162     def get_cells_type(self, cell_type):
 /usr/local/lib/python3.6/dist-packages/meshio/_helpers.py in write(filename, mesh, file_format, **kwargs)
     144 
     145     # Write
 --> 146     return writer(filename, mesh, **kwargs)
 /usr/local/lib/python3.6/dist-packages/meshio/gmsh/main.py in (f, m, **kwargs)
     109     {
     110         "gmsh22": lambda f, m, **kwargs: write(f, m, "2.2", **kwargs),
 --> 111         "gmsh": lambda f, m, **kwargs: write(f, m, "4.1", **kwargs),
     112     },
     113 )
 /usr/local/lib/python3.6/dist-packages/meshio/gmsh/main.py in write(filename, mesh, fmt_version, binary, float_fmt)
     100             )
     101 
 --> 102     writer.write(filename, mesh, binary=binary, float_fmt=float_fmt)
     103 
     104 
 /usr/local/lib/python3.6/dist-packages/meshio/gmsh/_gmsh41.py in write(filename, mesh, float_fmt, binary)
     356 
     357         _write_entities(fh, cells, tag_data, mesh.cell_sets, mesh.point_data, binary)
 --> 358         _write_nodes(fh, mesh.points, mesh.cells, mesh.point_data, float_fmt, binary)
     359         _write_elements(fh, cells, tag_data, binary)
     360         if mesh.gmsh_periodic is not None:
 /usr/local/lib/python3.6/dist-packages/meshio/gmsh/_gmsh41.py in _write_nodes(fh, points, cells, point_data, float_fmt, binary)
     609         if len(cells) != 1:
     610             raise WriteError(
 --> 611                 "Specify entity information to deal with more than one cell type"
     612             )
     613 
 WriteError: Specify entity information to deal with more than one cell type

Preparing mesh and boundary files for Fenics

Falling back to gmsh format version 2.2, I could generate the mesh and boundary files like in the original post:

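import meshio
import numpy as np

# "test" matches the file name written by pygmsh above
prefix = "test"
outfile_mesh = f"{prefix:s}_mesh.xdmf"
outfile_boundary = f"{prefix:s}_boundaries.xdmf"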

# read input from infile
inmsh = meshio.read(f"{prefix:s}.msh")
# delete third (obj=2) column (axis=1), this strips the z-component
outpoints = np.delete(arr=inmsh.points, obj=2, axis=1)

# create (two dimensional!) triangle mesh file
outmsh = meshio.Mesh(points=outpoints,
                      cells=[('triangle', inmsh.get_cells_type("triangle"))],
                      cell_data={'Subdomain': [inmsh.cell_data_dict['gmsh:physical']['triangle']]},
                      field_data=inmsh.field_data)

# write mesh to file
meshio.write(outfile_mesh, outmsh)
# create (two dimensional!) boundary data file
outboundary = meshio.Mesh(points=outpoints,
                           cells=[('line', inmsh.get_cells_type('line') )],
                           cell_data={'Boundary': [inmsh.cell_data_dict['gmsh:physical']['line']]},
                           field_data=inmsh.field_data)
# write boundary data to file
meshio.write(filename=outfile_boundary, mesh=outboundary)

Only to then figure out that while pygmsh allows you to assign a string label to physical groups, it numbers them automatically (apparently starting at 0). This is fine, it just does not seem to be documented anywhere.
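One way to look up which number ended up behind which label is the field_data dictionary that meshio fills when reading a gmsh file. As far as I understand meshio’s gmsh reader, each entry maps the label to a pair of physical tag and dimension. The helper name physical_groups and the example values below are made up to mirror the situation described here; on a real mesh you would pass inmsh.field_data.

```python
def physical_groups(field_data):
    """Turn a meshio-style field_data mapping into {label: (tag, dim)}."""
    groups = {}
    for label, entry in field_data.items():
        # Assumption: entry is a pair [physical tag, dimension]
        groups[label] = (int(entry[0]), int(entry[1]))
    return groups


# Hypothetical values: label "1" for the lines, label "2" for the surface
example_field_data = {"1": [0, 1], "2": [1, 2]}

groups = physical_groups(example_field_data)
for label, (tag, dim) in groups.items():
    print(f"label {label!r}: physical tag {tag}, dimension {dim}")
```

The tag printed for the one-dimensional group is the number to hand to the boundary condition.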

Modifying the original code at the definition of the Dirichlet Boundary Condition did the trick (full listing at the end):

# ...

bcs = [
     do.DirichletBC(FS, do.Constant((0.0, 0.0)), outerwall, 0), # Choose physical group "index" zero here.
     ]

# ...

Conclusion

While this arguably still works and does the job, it took me a few tries to figure out how. So, I hope this helps others at some point.

Full listing of the Fenics part

import dolfin as do
import numpy as np
import matplotlib.pyplot as plt

# Import Mesh
mesh = do.Mesh()
with do.XDMFFile(outfile_mesh) as meshfile, \
        do.XDMFFile(outfile_boundary) as boundaryfile:
    meshfile.read(mesh)
    mvc = do.MeshValueCollection("size_t", mesh, 2)
    boundaryfile.read(mvc, "Boundary")
    outerwall = do.MeshFunction("size_t", mesh, mvc)

do.plot(mesh); plt.show()

# Generate Function Space
FE = do.FiniteElement("RTE", mesh.ufl_cell(), 3)
FS = do.FunctionSpace(mesh, FE)

# Use markers for boundary conditions (watch for the change! ;-) )
bcs = [
    do.DirichletBC(FS, do.Constant((0.0, 0.0)), outerwall, 0), # Choose physical group "index" zero here.
    ]

# Trial and Test functions
E = do.TrialFunction(FS)
EE = do.TestFunction(FS)

# Helmholtz EVP (Ae = -k_co^2B*e)
a = do.inner(do.curl(E), do.curl(EE))*do.dx
b = do.inner(E, EE)*do.dx

# For EVP make use of PETSc and SLEPc
dummy = E[0]*do.dx
A = do.PETScMatrix()
B = do.PETScMatrix()

# Assemble System
do.assemble_system(a, dummy, bcs, A_tensor=A)
do.assemble_system(b, dummy, bcs, A_tensor=B)

# Apply Boundaries
[bc.apply(B) for bc in bcs]
[bc.apply(A) for bc in bcs]

# Let SLEPc solve that generalised EVP
solver = do.SLEPcEigenSolver(A, B)
solver.parameters["solver"] = "krylov-schur"
solver.parameters["tolerance"] = 1e-16
solver.parameters["problem_type"] = "gen_hermitian"
solver.parameters["spectrum"] = "target magnitude"
solver.parameters["spectral_transform"] = "shift-and-invert"
solver.parameters["spectral_shift"] = -(2.*np.pi/10e-3)**2

neigs = 20
solver.solve(neigs)
print(f"Found {solver.get_number_converged():d} solutions.")

# Return the computed eigenvalues in a sorted array
computed_eigenvalues = []

for i in range(min(neigs, solver.get_number_converged())):
    r, _, fieldRe, fieldIm = solver.get_eigenpair(i) # ignore the imaginary part
    f = do.Function(FS)
    f.vector()[:] = fieldRe
    if np.abs(r) > 1.1:
        # With r = gamma^2, find gamma = sqrt(r)
        gamma = np.sqrt(r)
        do.plot(f); plt.title(f"{gamma:f}"); plt.show()
    computed_eigenvalues.append(r)

print(np.sort(np.array(computed_eigenvalues)))

Categories
Finite Elements

Breaking Changes: Fenics and Meshio>=4.0.0

Package versions this was tested with (2020-12-22):
gmsh 4.7.1
fenics 2019.1.0:latest
meshio 4.3.7
pygmsh 7.1.5

Check your versions with the code provided here.

Meshio is a great piece of software if you’re into converting meshes, or for that matter, if you’re using Fenics for finite element calculations. However, the problem I experienced with both packages is their rapid development, which brings breaking changes and sometimes a lack of up-to-date documentation.

So, the Fenics community decided to embrace XDMF as mesh format of choice (good idea) but as usual with new stuff it’s WIP. So, there’s an ongoing thread on the discussion group which is extensive, but frankly hard to follow. After a day of fiddling, I realised that a lot of example code discussed in the thread broke with Meshio version 4.0.0 and newer. It’s a subtle change in Meshio, which makes it non-obvious.

So, with no further ado, here are code blocks that

  1. Generate a simple mesh using Gmsh
    1. A geo file
    2. A code block using Gmsh’s python API
  2. Convert msh-file to two XDMF files (mesh and boundary markers) for Fenics to digest
  3. Import of the XDMF files into Fenics and a minimal use-case

Generate a simple mesh using Gmsh

The two methods shown below are equivalent. Which one to choose depends on your workflow. It should be noted that getting the Gmsh Python API is not hard but still a bit tricky depending on context.

The goal is to build a rectangle of 5 mm × 2.5 mm with two physical groups (the boundary and the inner area).

Let’s start with the most static but straightforward approach: The geo-file straight from Gmsh’s UI. We’ll use the OpenCASCADE kernel (because, why not?):

SetFactory("OpenCASCADE");
Rectangle(1) = {-1.3, 0.5, 0, 5e-3, 2.5e-3, 0};
Physical Surface(2) = {1};
Physical Curve(1) = {1,2,3,4};

Save this file and generate a .msh file using the following command (we’re forcing a 2D mesh with -2 and gmsh format version 4.1 using -format msh41 here):

gmsh -2 -format msh41 test.geo test.msh

And last but not least using Gmsh’s Python API it looks like this:

import gmsh
import numpy as np

gmsh.initialize()
gmsh.model.add("test")
gmsh.logger.start()

r1 = gmsh.model.occ.addRectangle(0, 0, 0, 5e-3, 2.5e-3)
gmsh.model.occ.synchronize()

pg1 = gmsh.model.addPhysicalGroup(2, [1], tag=2) # inner surface
pg2 = gmsh.model.addPhysicalGroup(1, [1, 2, 3, 4], tag=1) # outer wall

gmsh.model.mesh.generate(dim=2)

gmsh.write("test.msh")

This example is closely based on an example in gmsh’s docs (t16.geo, t16.py). I can’t claim to know what gmsh.initialize, gmsh.model.add, gmsh.logger.start and gmsh.model.occ.synchronize actually do.

Convert msh-file to two XDMF files (mesh and boundary markers) for Fenics to digest

As I understand it, Fenics currently requires separate files for mesh and – let’s say – mesh annotations (like boundary names/numbers etc.). So, what the discussion came down to is converting the mesh as obtained from gmsh into two xdmf-files: test_mesh.xdmf and test_boundaries.xdmf.

The code below is shamelessly stolen mostly from dokken’s post and Frankenstein-copy-pasted together with other contributions from forum users.

import meshio
import numpy as np

# Let's introduce some symbolic names:
infile_mesh = "test.msh"
outfile_mesh = "test_mesh.xdmf"
outfile_boundary = "test_boundaries.xdmf"

# read input from infile
inmsh = meshio.read(infile_mesh)

# delete third (obj=2) column (axis=1), this strips the z-component
outpoints = np.delete(arr=inmsh.points, obj=2, axis=1)

# create (two dimensional!) triangle mesh file
outmsh = meshio.Mesh(points=outpoints,
                      cells=[('triangle', inmsh.get_cells_type("triangle"))],
                      cell_data={'Subdomain': [inmsh.cell_data_dict['gmsh:physical']['triangle']]},
                      field_data=inmsh.field_data)

# write mesh to file
meshio.write(outfile_mesh, outmsh)

# create (two dimensional!) boundary data file
outboundary = meshio.Mesh(points=outpoints,
                           cells=[('line', inmsh.get_cells_type('line') )],
                           cell_data={'Boundary': [inmsh.cell_data_dict['gmsh:physical']['line']]},
                           field_data=inmsh.field_data)

# write boundary data to file
meshio.write(filename=outfile_boundary, mesh=outboundary)

So, after running this code snippet, you should see two files: test_mesh.xdmf and test_boundaries.xdmf

Import of the XDMF files into Fenics and a minimal use-case

Well, minimal may be something different. But since I’m using this approach for RF-related eigenvalue problems (EVP), this is as close to minimal as it gets:

import dolfin as do
import numpy as np
import matplotlib.pyplot as plt # for plotting mesh/results

# Import Mesh
mesh = do.Mesh()
with do.XDMFFile(outfile_mesh) as meshfile, \
        do.XDMFFile(outfile_boundary) as boundaryfile:
    meshfile.read(mesh)
    mvc = do.MeshValueCollection("size_t", mesh, 2)
    boundaryfile.read(mvc, "Boundary")
    outerwall = do.MeshFunction("size_t", mesh, mvc)

# Uncomment, if you want to see the mesh
# do.plot(mesh); plt.show()

# Generate Function Space
FE = do.FiniteElement("RTE", mesh.ufl_cell(), 1)
FS = do.FunctionSpace(mesh, FE)

# Use markers for boundary conditions
bcs = [
     do.DirichletBC(FS, do.Constant((0.0, 0.0)), outerwall, 1),
     ]

# Trial and Test functions
E = do.TrialFunction(FS)
EE = do.TestFunction(FS)

# Helmholtz EVP (Ae = -k_co^2B*e)
a = do.inner(do.curl(E), do.curl(EE))*do.dx
b = do.inner(E, EE)*do.dx

# For EVP make use of PETSc and SLEPc
dummy = E[0]*do.dx
A = do.PETScMatrix()
B = do.PETScMatrix()

# Assemble System
do.assemble_system(a, dummy, bcs, A_tensor=A)
do.assemble_system(b, dummy, bcs, A_tensor=B)
# Apply Boundaries
[bc.apply(B) for bc in bcs]
[bc.apply(A) for bc in bcs]

# Let SLEPc solve that generalised EVP
solver = do.SLEPcEigenSolver(A, B)
solver.parameters["solver"] = "krylov-schur"
solver.parameters["tolerance"] = 1e-16
solver.parameters["problem_type"] = "gen_hermitian"
solver.parameters["spectrum"] = "target magnitude"
solver.parameters["spectral_transform"] = "shift-and-invert"
solver.parameters["spectral_shift"] = -(2.*np.pi/10e-3)**2

neigs = 10
solver.solve(neigs)

print(f"Found {solver.get_number_converged():d} solutions.")

# Return the computed eigenvalues in a sorted array
computed_eigenvalues = []

for i in range(min(neigs, solver.get_number_converged())):
    r, _, fieldRe, fieldIm = solver.get_eigenpair(i) # ignore the imaginary part
    f = do.Function(FS)
    f.vector()[:] = fieldRe

    if np.abs(r) > 1.1:
        # With r = k_co^2, find k_co = sqrt(r)
        k_co = np.sqrt(r)
        do.plot(f); plt.title(f"{k_co:f}"); plt.show()
    computed_eigenvalues.append(r)

print("All computed eigenvalues:")
print(np.sort(np.array(computed_eigenvalues)))

This gives me three modes for this 5 mm by 2.5 mm waveguide and a bunch of spurious modes, which are filtered out because their k_co^2 is about 1 or smaller (the script only keeps |r| > 1.1).
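As a sanity check, the eigenvalues can be compared to the textbook cut-off wavenumbers of an ideal rectangular waveguide, k_co = sqrt((m·π/a)^2 + (n·π/b)^2). The sketch below just tabulates that formula for a few index pairs (m, n); the function name cutoff_wavenumbers and the index limit are arbitrary choices of mine. Which of these pairs the solver actually returns depends on the formulation, so treat this as a list of candidates to compare against, not as the expected output.

```python
import math


def cutoff_wavenumbers(a, b, max_index=2):
    """Analytic k_co (in rad/m) of an a-by-b rectangular waveguide."""
    modes = []
    for m in range(max_index + 1):
        for n in range(max_index + 1):
            if m == 0 and n == 0:
                continue  # (0, 0) is not a mode
            k_co = math.hypot(m * math.pi / a, n * math.pi / b)
            modes.append((k_co, m, n))
    return sorted(modes)


modes = cutoff_wavenumbers(5e-3, 2.5e-3)
for k_co, m, n in modes[:5]:
    # k_co**2 is what should be compared to the eigenvalue r
    print(f"(m, n) = ({m}, {n}): k_co = {k_co:.1f} rad/m, k_co^2 = {k_co**2:.4g}")
```

The lowest entry is the (1, 0) pair with k_co = π/5 mm, roughly 628 rad/m.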

Conclusion

This used to be easier (let’s say back in 2016). I’m fairly sure this is not the most efficient way (both programmatically and numerically) and I’m sure as well that there are better ways to solve this problem. For instance, instead of a generic triangular mesh a more structured triangular mesh or even a quad mesh would produce better solutions. Also, a higher order finite element should produce “smoother” solutions.

I also wanted to add a code block for pygmsh, but it turned out that there are a few catches with physical labels, how they’re numbered and labelled, and how to handle them in Fenics that I could not figure out. So, there’s a separate post on that here.

I am happy and thankful for any feedback!

How to check your package versions

import fenics
import meshio
import gmsh
import pygmsh

print("fenics:", fenics.__version__,
      "meshio:", meshio.__version__,
      "gmsh:", gmsh.__version__,
      "pygmsh:", pygmsh.__version__)