If you are faced with any sort of text format conversion problem, pandoc is probably the place to start.
Suppose you have a portion of a website that you would like to turn into an ebook.
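Retrieve it with
wget -nc -nd -v -r -l1 $URL
-r sets up recursion, and -l1 limits it to one level deep. You might need more than that. -nc skips files that have already been downloaded, and -nd saves everything into the current directory instead of recreating the site's directory tree.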
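If one level is not enough, the depth can be made a parameter. This is only a rough sketch: fetch_site is a made-up helper name, and adding -np (wget's --no-parent option) is my own assumption about what you would want once the crawl goes deeper, since it keeps wget from wandering above the starting directory.

```shell
# fetch_site URL [DEPTH]
# Hypothetical wrapper around the wget command above; DEPTH defaults to 1.
fetch_site() {
    local url="$1"
    local depth="${2:-1}"
    # -np (--no-parent) is an addition not in the original command:
    # it stops the recursion from climbing above the starting directory.
    wget -nc -nd -v -r -l"$depth" -np "$url"
}
```

Calling fetch_site "$URL" 2 would then follow links two levels deep instead of one.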
Assemble your ebook with
pandoc $file1 $file2 $file3... -t epub -o $ebook.epub
There are lots of other options, including the ability to add your own CSS, table of contents, cover image… It’s really quite nice.
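As a rough illustration of those options, here is one way the pandoc call might look with a table of contents, a stylesheet and a cover image. build_ebook is a made-up helper name, and style.css and cover.jpg are placeholder files you would have to supply yourself; --toc, --css and --epub-cover-image are pandoc options.

```shell
# build_ebook OUTPUT.epub FILE...
# Hypothetical wrapper: the input files are passed to pandoc in reading order.
build_ebook() {
    local out="$1"
    shift
    # style.css and cover.jpg are placeholder names, not files pandoc provides.
    pandoc "$@" -t epub --toc --css=style.css --epub-cover-image=cover.jpg -o "$out"
}
```

For example, build_ebook "$ebook.epub" $file1 $file2 $file3 would produce the same book as the plain command, plus the extras.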