XML toolkit from the GNOME project

Full documentation is available on-line at
    http://xmlsoft.org/

This code is released under the MIT Licence, see the Copyright file.

To build on a Unix-like setup:
    ./configure ; make ; make install
    If the ./configure file does not exist, run ./autogen.sh instead.
To build on Windows:
    see instructions in win32/Readme.txt

To assert build quality:
    on a Unix-like setup:
        run make tests
    otherwise:
        There are 3 standalone tools, runtest.c, runsuite.c and testapi.c,
        which should compile as part of the build or as any application
        would. Launch them from this directory to get results. runtest
        checks the proper functioning of libxml2 main APIs while testapi
        does a full coverage check. Report failures to the list.

To report bugs, follow the instructions at:
    http://xmlsoft.org/bugs.html

A mailing-list [email protected] is available, to subscribe:
    http://mail.gnome.org/mailman/listinfo/xml

The list archive is at:
    http://mail.gnome.org/archives/xml/

All technical answers asked privately will be automatically answered on
the list and archived for public access unless privacy is explicitly
required and justified.

Daniel Veillard

$Id$
-
The Root element should retain all but empty ns declarations.
As far as I can see, the C14N serialization does not follow the specification here. Because namespace declarations are output only as needed (i.e. the use of a namespace on an element triggers the output of its declaration), the namespace declarations on the root element are not all preserved, as the specification requires.
According to http://www.w3.org/TR/2001/REC-xml-c14n-20010315 :
"The root document element is handled specially since it has no parent element. All namespace declarations in it are retained, except the declaration of an empty default namespace is automatically omitted."
Perhaps this is not the optimal way of doing this, feel free to point me in the right direction if this is the case.
Cheers.
-
Update xmlParseBalancedChunkMemoryRecover to handle parseFlags from xmlNewDoc
Hello, I am proposing a small change to support parse options in xmlParseBalancedChunkMemory. I do not know of a better way to do it in this case, but this small fix helps; maybe you can tell me how to do it better.
Let me describe the bug I am trying to fix. Please don't blame me if I am doing something wrong: this is my first attempt at contributing to a big open source project.
I ran into a failure when loading a big XML file into a PostgreSQL database: it caused a "Segmentation fault" error with Ubuntu's version of the library, "libxml2:amd64 2.9.4+dfsg1-6.1ubuntu1.3", and postgresql 12.2-2.pgdg18.04+1. (This is awkward, because simplexml_load_file in php7.2, which uses the same libxml2, does not need any flags and opens the entire (bigger) file successfully.)
My next step was to understand what was going wrong. I built "libxml2.so.2.9.10" from source and got a proper error saying that libxml cannot parse the document because a tag has too many children (line 74032: internal error: Huge input lookup). Googling this error led me to a solution: use xmlReadMemory instead of xmlParseMemory and pass it the XML_PARSE_HUGE flag. I opened the PostgreSQL sources and patched all occurrences of xmlParseMemory to a customizable variant, but one place used another function (xmlParseBalancedChunkMemoryRecover, see usage below) to which I was unable to pass a flag. I found that the doc holds the parse flags (doc->parseFlags), but further investigation showed that the flags inside the doc are not used in the parser context.
I think my patch to PostgreSQL can be put behind a configure flag, with a version dependency, to support huge XML documents, but only after this small patch to libxml.
PostgreSQL uses libxml2 like this:
    doc = xmlNewDoc(version);
    Assert(doc->encoding == NULL);
    doc->encoding = xmlStrdup((const xmlChar *) "UTF-8");
    doc->standalone = standalone;
    doc->parseFlags |= XML_PARSE_HUGE; // <--------- propose to add flag here

    /* allow empty content */
    if (*(utf8string + count))
    {
        res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0,
                                               utf8string + count, NULL);
        if (res_code != 0 || xmlerrcxt->err_occurred)
            xml_ereport(xmlerrcxt, ERROR, ERRCODE_INVALID_XML_CONTENT,
                        "invalid XML content");
    }
After these improvements my huge document was inserted fine, XPath queries work, and 100 GB of XML files were imported without any new database crashes.
                       List of relations
 Schema |      Name      | Type  | Owner |  Size   | Description
--------+----------------+-------+-------+---------+-------------
 public | egrip          | table | egrip | 16 GB   |
 public | egrip_test     | table | egrip | 13 MB   |
 public | egrip_versions | table | egrip | 17 GB   |
 public | egrul          | table | egrip | 33 GB   |
 public | egrul_versions | table | egrip | 45 GB   |
If you would be so kind as to apply my patch in a new release, it would make my further patch to PostgreSQL possible. Thank you in advance; I hope you can help with this.
Notes: here are my commits to fix the XML_PARSE_HUGE problem in PostgreSQL:
https://github.com/ramzes642/postgres/commit/6eae093d9d1331fa9de92e41f463c263aaf3b641 - commit that needs no libxml2 modification
https://github.com/ramzes642/postgres/commit/b59459a16b13de718dde21642452dbdbb253c316 - commit that needs the libxml2 modification
-
Patch to set IO error on file not found
Currently, whenever a file is not present for reading, NULL is simply returned after calling xmlCheckFilename() without setting any error, so nothing reaches the error callbacks. This makes it difficult to understand what caused the file opening to fail. Calling xmlIOErr(0, path) in its negative scenario sets the error appropriately, so we can handle the file-not-found scenario effectively.
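The pattern described above (report the failure before returning NULL) can be sketched in plain C without libxml2. Everything named here (io_error_fn, open_for_read_reporting, io_error_log, record_io_error) is hypothetical and only illustrates the idea; it is not the libxml2 API and not the code of the patch.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical callback type, standing in for an application's error handler. */
typedef void (*io_error_fn)(const char *path, int err, void *user_data);

/* Hypothetical wrapper (not libxml2 API): opens `path` for reading and, on
 * failure, reports the errno value through `on_error` before returning NULL. */
FILE *open_for_read_reporting(const char *path, io_error_fn on_error,
                              void *user_data)
{
    FILE *fp;

    if (path == NULL) {
        if (on_error != NULL)
            on_error("(null)", EINVAL, user_data);
        return NULL;
    }

    errno = 0;
    fp = fopen(path, "rb");
    if (fp == NULL && on_error != NULL)
        on_error(path, errno, user_data);   /* caller now learns why it failed */
    return fp;
}

/* Example handler: counts failures and remembers the last errno value. */
struct io_error_log {
    int count;
    int last_errno;
};

void record_io_error(const char *path, int err, void *user_data)
{
    struct io_error_log *log = user_data;

    (void)path;
    log->count++;
    log->last_errno = err;
}
```

With a handler like record_io_error installed, a missing file produces one recorded error instead of a silent NULL.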
-
Add ends-with function
Hello,
I would like to add a few functions to xpath.c; I have started with "ends-with", similar to the already existing "starts-with" function. I have updated the test suite and the documentation accordingly (at least I think I did so). I would like to know whether it is OK to add new functions, and I have a few questions:
- how do I update the documentation? (I have edited several files with similar contents by hand)
- how do I test the Python bindings? (I do not think I have seen any Python tests)
Regards,
Philippe Marguinaud
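For readers unfamiliar with the semantics being proposed, here is a minimal sketch of an "ends-with" test on plain C strings. The helper name str_ends_with is hypothetical, it works on bytes rather than on xmlChar or XPath objects, and it is not the code from this merge request. It assumes that an empty suffix matches any string, by analogy with "starts-with" and an empty prefix.

```c
#include <string.h>

/* Hypothetical helper (not libxml2 API): returns 1 if `s` ends with `suffix`,
 * otherwise 0. An empty suffix is assumed to match any string. */
int str_ends_with(const char *s, const char *suffix)
{
    size_t s_len = strlen(s);
    size_t suffix_len = strlen(suffix);

    if (suffix_len > s_len)
        return 0;

    /* Compare the last suffix_len bytes of s against the suffix. */
    return memcmp(s + (s_len - suffix_len), suffix, suffix_len) == 0;
}
```

A real XPath implementation would also have to convert both arguments to strings and push a boolean result onto the evaluation stack; the sketch covers only the comparison itself.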
-
[cmake] Move options, correct install include path. fix libxml2-config.cmake
-
Move the options for dependencies to the top of find_package.
Reason: Automatically setting options for dependencies based on the search results of dependencies is inappropriate: it may cause actual results to be inconsistent with expectations. So when an option is set to ON but the corresponding dependency cannot be found, the build should report the error "Dependency not found".
-
Provide macro LIBXML2_INCLUDE_DIR.
Reason: Since libxml2 did not provide a cmake export function before, most libraries use FindLibxml2.cmake in cmake, and some libraries use the LIBXML2_INCLUDE_DIR macro provided by cmake. Add this macro to better adapt to existing usage.
-
Use the attributes of the target LibXml2::LibXml2 in libxml2-config.cmake instead of manually setting some macros (such as LIBXML2_INCLUDE_DIRS and LIBXML2_LIBRARIES).
Reason: Since libxml2 now provides cmake building methods, we should trust the attributes automatically exported by cmake rather than implement them manually. In this way, when we modify the code in CMakeLists.txt, we do not need to modify libxml2.config.cmake.cmake.in synchronously.
Related: https://github.com/microsoft/vcpkg/pull/14588
-
-
Fix building with ICU 68.
ICU 68 no longer defines the TRUE macro, as outlined in their updated Coding Guidelines. This causes building libxml2 with ICU 68 to fail with the following error:
    encoding.c:1961:31: error: use of undeclared identifier ‘TRUE’
        TRUE);
        ^
Given that xmlUconvWrapper() defines the parameter as int flush, using 1 instead of TRUE seems like the best solution.
-
Add environment CFLAGS to makefile.msvc
Libxml2 uses its own CFLAGS and does not consider additional flags, such as guard:cf and QSpectre, that we could easily set in our environment for libxml2 to pick up. As a result, we have to rebuild libxml2 manually, adding the additional flags to Makefile.msvc every time we want a new flag. This change gives the flexibility to set the needed CFLAGS in the environment itself.
-
Support file size bigger than 2G on Windows
stat() in all current MSVCRT versions is only 32-bit capable, even when compiled for x64. From VS2010 onward, if a file is bigger than 2G (or 4G for some versions, in which case st_size could become negative and dangerous to use), stat() returns -1. stat() in VS2008 returns 0 for such big files, but the file size in st_size is truncated and dangerous to use. To support files bigger than 2G, _stat64() should be used, and all following logic needs to be aware that st_size is now (at least) 64 bits wide. I have also fixed a couple of locations to be aware that the file size can be this big: they simply return an error immediately instead of moving on to logic that cannot handle files of such size.
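A minimal sketch of the approach described above, not the code of the patch itself: query the size through _stat64() on Windows and refuse early when the size does not fit the integer type that later code uses. The helper names file_size_64 and size_fits_int are hypothetical, and the non-Windows branch assumes a platform where stat() already reports 64-bit sizes.

```c
#include <limits.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the size of `path` in bytes as a 64-bit value,
 * or -1 if the file cannot be examined. */
long long file_size_64(const char *path)
{
#if defined(_WIN32)
    struct __stat64 st;             /* st_size is 64 bits wide here */

    if (_stat64(path, &st) != 0)
        return -1;
#else
    struct stat st;                 /* assumes off_t is 64-bit on this platform */

    if (stat(path, &st) != 0)
        return -1;
#endif
    return (long long)st.st_size;
}

/* Returns 1 if `size` can safely be handed to code that stores lengths in an
 * int, 0 otherwise, so callers can return an error immediately. */
int size_fits_int(long long size)
{
    return size >= 0 && size <= INT_MAX;
}
```

A caller would check size_fits_int(file_size_64(path)) before entering any code path that still works with int-sized lengths.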
-
appveyor.yml for windows CI build
-
initial version
-
standard configuration without dependency to iconv
-
see https://ci.appveyor.com/project/chcg/libxml2/build/2.9.8.8
-
could be extended for publishing artifacts, see https://ci.appveyor.com/project/chcg/libxml2/build/2.9.8.8/job/msbsc4fx4dofthh3/artifacts, and to create deployment zips on events, e.g. tags
-
mingw is also available on the appveyor CI VMs and could be configured, if such a build is wanted
-
-
added autogen.sh line to README and INSTALL.libxml2 files
There is currently no configure file in the root libxml2 directory. You have to run autogen.sh to create it, but this is not mentioned in the README or INSTALL.libxml2 files; both just tell users to run configure, which doesn't exist yet.
-
threads.c: compare function pointers using &func
Clang warns that a function is always non-NULL. In C, a bare function name (used without a call) and its address are the same thing, but Clang doesn't warn about the latter, so use it instead.
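The point can be illustrated with a small self-contained sketch. The function optional_hook is a hypothetical stand-in for whatever symbol is being tested; the code in threads.c itself is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical stand-in for a library function whose presence is checked. */
static int optional_hook(void)
{
    return 42;
}

/* Writing `optional_hook != NULL` is what Clang flags as always true.
 * Spelling the address explicitly expresses the same test. */
int hook_is_available(void)
{
    return &optional_hook != NULL;
}

/* The bare function name and its explicit address yield the same pointer. */
int name_and_address_match(void)
{
    int (*by_name)(void) = optional_hook;
    int (*by_address)(void) = &optional_hook;

    return by_name == by_address;
}

/* Calls the hook only when the availability check passes. */
int call_hook_if_available(void)
{
    return hook_is_available() ? optional_hook() : -1;
}
```

Because optional_hook is defined in the same file, the check is trivially true here and some compilers may still emit an "address is never NULL" warning; the sketch only shows that both spellings are equivalent.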
Owner
GNOME Github Mirror