# Change Log for SD.Next
## TODO
- reference styles
- quick apply style
## Update for 2024-03-19
### Highlights 2024-03-19
New models:
- [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
- [Playground v2.5](https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic)
- [KOALA 700M](https://github.com/youngwanLEE/sdxl-koala)
- [Stable Video Diffusion XT 1.1](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1)
- [VGen](https://huggingface.co/ali-vilab/i2vgen-xl)
New pipelines and features:
- Img2img using [LEdit++](https://leditsplusplus-project.static.hf.space/index.html), context aware method with image analysis and positive/negative prompt handling
- Trajectory Consistency Distillation [TCD](https://mhh0318.github.io/tcd) for processing in even fewer steps
- Visual Query & Answer using [moondream2](https://github.com/vikhyat/moondream) as an addition to standard interrogate methods
- **Face-HiRes**: simple built-in detailer for face refinements
- Even simpler outpaint: when resizing an image, simply pick an outpaint method; if the image has a different aspect ratio, blank areas will be outpainted!
- UI aspect-ratio controls and other UI improvements
- User-controllable invisible and visible watermarking
- Native composable LoRA
What else?
- **Reference models**: *Networks -> Models -> Reference*: All reference models now come with recommended settings that can be auto-applied if desired
- **Styles**: Not just for prompts! Styles can apply *generate parameters* as templates and can be used to *apply wildcards* to prompts
- Additional API endpoints and other improvements
- Given the high interest in the [ZLUDA](https://github.com/vosen/ZLUDA) engine introduced in the last release, we've implemented a much more flexible/automatic install procedure (see [wiki](https://github.com/vladmandic/automatic/wiki/ZLUDA) for details)
- Plus additional improvements such as: smooth tiling, refine/hires workflow improvements, Control workflow
Further details:
- For basic instructions, see [README](https://github.com/vladmandic/automatic/blob/master/README.md)
- For more details on all new features see full [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
- For documentation, see [WiKi](https://github.com/vladmandic/automatic/wiki)
- [Discord](https://discord.com/invite/sd-next-federal-batch-inspectors-1101998836328697867) server
### Full Changelog 2024-03-19
- [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
- large multi-stage high-quality model from warp-ai/wuerstchen team and released by stabilityai
- download using networks -> reference
- see [wiki](https://github.com/vladmandic/automatic/wiki/Stable-Cascade) for details
- [Playground v2.5](https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic)
- new model version from Playground: based on SDXL, but with some cool new concepts
- download using networks -> reference
- set sampler to *DPM++ 2M EDM* or *Euler EDM*
- [KOALA 700M](https://github.com/youngwanLEE/sdxl-koala)
- another very fast & light SDXL model where the original UNet was compressed and distilled to 54% of its original size
- download using networks -> reference
- *note*: to download the fp16 variant (recommended), set settings -> diffusers -> preferred model variant
- [LEdit++](https://leditsplusplus-project.static.hf.space/index.html)
- context aware img2img method with image analysis and positive/negative prompt handling
- enable via img2img -> scripts -> ledit
- uses following params from standard img2img: cfg scale (recommended ~3), steps (recommended ~50), denoise strength (recommended ~0.7)
- can use positive and/or negative prompt to guide the editing process
- positive prompt: what to enhance, strength and threshold for auto-masking
- negative prompt: what to remove, strength and threshold for auto-masking
- *note*: not compatible with model offloading
- **Second Pass / Refine**
- independent upscale and hires options: run hires without upscale or upscale without hires or both
- upscale can now run 0.1-8.0 scale and will also run if enabled at 1.0 to allow for upscalers that simply improve image quality
- update ui section to reflect changes
- *note*: behavior using backend:original is unchanged for backwards compatibility
- **Visual Query** visual query & answer in process tab
- go to process -> visual query
- ask your questions, e.g. "describe the image", "what is behind the subject", "what are predominant colors of the image?"
- primary model is [moondream2](https://github.com/vikhyat/moondream), a *tiny* 1.86B vision language model
*note*: it's still 3.7GB in size, so not really tiny
- additional support for multiple variations of several base models: *GIT, BLIP, ViLT, PIX*, sizes range from 0.3 to 1.7GB
- **Video**
- **Image2Video**
- new module for creating videos from images
- simply enable from *img2img -> scripts -> image2video*
- model is auto-downloaded on first use
- based on [VGen](https://huggingface.co/ali-vilab/i2vgen-xl)
- **Stable Video Diffusion**
- updated with *SVD 1.0, SVD XT 1.0 and SVD XT 1.1*
- models are auto-downloaded on first use
- simply enable from *img2img -> scripts -> stable video diffusion*
- for svd 1.0, use frames=~14, for xt models use frames=~25
- **Composable LoRA**, thanks @AI-Casanova
- control lora strength for each step
for example: `<xxx:0.1@0,0.9@1>` means strength=0.1 at step 0%, interpolating towards strength=0.9 at step 100%
- *note*: this is a very experimental feature and may not work as expected
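The step-strength syntax above can be sketched as a small interpolation helper. This is an illustrative reimplementation under assumed semantics, not SD.Next's actual parser, and the function names are made up:

```python
# Hypothetical sketch of parsing a stepwise LoRA strength schedule such as
# "0.1@0,0.9@1" and interpolating it across denoising progress.
# Names and exact semantics are assumptions, not SD.Next internals.

def parse_schedule(spec: str):
    """Parse 'strength@position' pairs, e.g. '0.1@0,0.9@1'."""
    points = []
    for part in spec.split(","):
        strength, pos = part.split("@")
        points.append((float(pos), float(strength)))
    return sorted(points)

def strength_at(points, progress: float) -> float:
    """Linearly interpolate strength at a given step progress in [0, 1]."""
    if progress <= points[0][0]:
        return points[0][1]
    for (p0, s0), (p1, s1) in zip(points, points[1:]):
        if progress <= p1:
            t = (progress - p0) / (p1 - p0)
            return s0 + t * (s1 - s0)
    return points[-1][1]

points = parse_schedule("0.1@0,0.9@1")
print(strength_at(points, 0.5))  # midway between 0.1 and 0.9 -> 0.5
```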
- **Control**
- added *refiner/hires* workflows
- added resize methods to before/after/mask: fixed, crop, fill
- **Styles**: styles are not just for prompts!
- new styles editor: *networks -> styles -> edit*
- styles can apply generate parameters, for example to have a style that enables and configures hires:
parameters=`enable_hr: True, hr_scale: 2, hr_upscaler: Latent Bilinear antialias, hr_sampler_name: DEIS, hr_second_pass_steps: 20, denoising_strength: 0.5`
- styles can apply wildcards to prompts, for example:
wildcards=`movie=mad max, dune, star wars, star trek; intricate=realistic, color sketch, pencil sketch, intricate`
- as usual, you can apply any number of styles, so you can choose which settings are applied, in which order, and which wildcards are used
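The wildcard format shown above could be parsed and applied roughly as follows. This is a hedged sketch: the `__name__` placeholder syntax and function names are assumptions for illustration, not SD.Next's documented syntax:

```python
import random

# Hypothetical sketch of wildcard expansion: parse "name=a, b; other=x, y"
# into a lookup table, then substitute a random choice for each placeholder.
# The "__name__" placeholder convention is an assumption.

def parse_wildcards(spec: str) -> dict:
    """Parse 'name=a, b, c; other=x, y' into {name: [choices]}."""
    table = {}
    for group in spec.split(";"):
        name, values = group.split("=", 1)
        table[name.strip()] = [v.strip() for v in values.split(",")]
    return table

def expand(prompt: str, table: dict, rng: random.Random) -> str:
    """Replace each __name__ placeholder with a random choice."""
    for name, choices in table.items():
        prompt = prompt.replace(f"__{name}__", rng.choice(choices))
    return prompt

table = parse_wildcards("movie=mad max, dune; intricate=realistic, color sketch")
print(expand("a scene from __movie__, __intricate__ style", table, random.Random(0)))
```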
- **UI**
- *aspect-ratio*: add selector and lock to width/height controls
allowed aspect ratios can be configured via *settings -> user interface*
- *interrogate* tab is now merged into *process* tab
- *image viewer* now displays image metadata
- *themes* improve on-the-fly switching
- *log monitor* flag server warnings/errors and overall improve display
- *control* separate processor settings from unit settings
- **Face HiRes**
- new *face restore* option, works similarly to the well-known *adetailer* by running an inpaint on detected faces, but with just a checkbox to enable/disable
- set as default face restorer in settings -> postprocessing
- disabled by default, to enable simply check *face restore* in your generate advanced settings
- strength, steps and sampler are set using the hires section in the refine menu
- strength can be overridden in settings -> postprocessing
- will use secondary prompt and secondary negative prompt if present in refine
- **Watermarking**
- SD.Next disables all known watermarks in models, but does allow user to set custom watermark
- see *settings -> image options -> watermarking*
- invisible watermark: using steganography
- image watermark: overlaid on top of image
- **Reference models**
- additional reference models available for single-click download & run:
*Stable Cascade, Stable Cascade lite, Stable Video Diffusion XT 1.1*
- reference models will now download *fp16* variation by default
- reference models will print recommended settings to log if present
- new setting in extra network: *use reference values when available*
disabled by default, if enabled will force use of reference settings for models that have them
- **Samplers**
- [TCD](https://mhh0318.github.io/tcd/): Trajectory Consistency Distillation
new sampler that produces consistent results in a very low number of steps (comparable to LCM but without reliance on LoRA)
for best results, use with TCD LoRA: <https://huggingface.co/h1t/TCD-SDXL-LoRA>
- *DPM++ 2M EDM* and *Euler EDM*
EDM is a new solver algorithm currently available for DPM++2M and Euler samplers
Note that using EDM samplers with non-EDM optimized models will produce just noise, and vice-versa
- **Improvements**
- **FaceID** extend support for LoRA, HyperTile and FreeU, thanks @Trojaner
- **Tiling** now extends to both Unet and VAE producing smoother outputs, thanks @AI-Casanova
- new setting in image options: *include mask in output*
- improved params parsing from prompt string and styles
- default theme updates and additional built-in theme *black-gray*
- support models with their own YAML model config files
- support models with their own JSON per-component config files, for example: `playground-v2.5_vae.config`
- prompt can have comments enclosed with `/*` and `*/`
comments are extracted from prompt and added to image metadata
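The comment extraction described above could look roughly like this minimal sketch; the function name is hypothetical and this is not SD.Next's actual implementation:

```python
import re

# Minimal sketch of stripping /* ... */ comments from a prompt so they can be
# excluded from conditioning but preserved for image metadata.
# Illustrative only; not SD.Next's actual parser.

COMMENT_RE = re.compile(r"/\*(.*?)\*/", re.DOTALL)

def split_comments(prompt: str):
    """Return (cleaned prompt, list of extracted comments)."""
    comments = [m.strip() for m in COMMENT_RE.findall(prompt)]
    cleaned = COMMENT_RE.sub("", prompt)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()  # collapse leftover gaps
    return cleaned, comments

cleaned, comments = split_comments("a cat /* test run 42 */ on a sofa")
print(cleaned, comments)  # a cat on a sofa ['test run 42']
```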
- **ROCm**
- add **ROCm** 6.0 nightly option to installer, thanks @jicka
- add *flash attention* support for rdna3, thanks @Disty0
install flash_attn package for rdna3 manually and enable *flash attention* from *compute settings*
to install flash_attn, activate the venv and run `pip install -U git+https://github.com/ROCm/flash-attention@howiejay/navi_support`
- **IPEX**
- disabled IPEX Optimize by default
- **API**
- add preprocessor api endpoints
GET:`/sdapi/v1/preprocessors`, POST:`/sdapi/v1/preprocess`, sample script:`cli/simple-preprocess.py`
- add masking api endpoints
GET:`/sdapi/v1/masking`, POST:`/sdapi/v1/mask`, sample script:`cli/simple-mask.py`
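A client for the new endpoints might be sketched as below. The endpoint paths come from this changelog, but the payload field names are assumptions; the bundled `cli/simple-preprocess.py` and `cli/simple-mask.py` scripts show the actual request shape:

```python
import json
import urllib.request

# Hypothetical client sketch for the preprocessor endpoints.
# Payload field names ("image", "model") are assumptions, not the documented API.

BASE = "http://127.0.0.1:7860"

def build_preprocess_payload(image_b64: str, model: str) -> bytes:
    """JSON body for POST /sdapi/v1/preprocess (field names assumed)."""
    return json.dumps({"image": image_b64, "model": model}).encode()

def list_preprocessors():
    """GET /sdapi/v1/preprocessors -- list available processors."""
    with urllib.request.urlopen(f"{BASE}/sdapi/v1/preprocessors") as resp:
        return json.load(resp)

def run_preprocess(image_b64: str, model: str):
    """POST /sdapi/v1/preprocess with an assumed JSON payload."""
    req = urllib.request.Request(
        f"{BASE}/sdapi/v1/preprocess",
        data=build_preprocess_payload(image_b64, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```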
- **Internal**
- improved vram efficiency for model compile, thanks @Disty0
- **stable-fast** compatibility with torch 2.2.1
- remove obsolete textual inversion training code
- remove obsolete hypernetworks training code
- **Refiner** validated workflows:
- Fully functional: SD15 + SD15, SDXL + SDXL, SDXL + SDXL-R
- Functional, but result is not as good: SD15 + SDXL, SDXL + SD15, SD15 + SDXL-R
- **SDXL Lightning** models just work; just make sure to set CFG Scale to 0
and choose a best-suited sampler, which may not be the one you're used to (e.g. maybe even basic Euler)
- **Fixes**
- improve *model cpu offload* compatibility
- improve *model sequential offload* compatibility
- improve *bfloat16* compatibility
- improve *xformers* installer to match cuda version and install triton
- fix extra networks refresh
- fix *sdp memory attention* in backend original
- fix autodetect sd21 models
- fix api info endpoint
- fix *sampler eta* in xyz grid, thanks @AI-Casanova
- fix *requires_aesthetics_score* errors
- fix t2i-canny
- fix *differential diffusion* for manual mask, thanks @23pennies
- fix ipadapter apply/unapply on batch runs
- fix control with multiple units and override images
- fix control with hires
- fix control-lllite
- fix font fallback, thanks @NetroScript
- update civitai downloader to handle new metadata
- improve control error handling
- use default model variant if specified variant doesn't exist
- use diffusers lora load override for *lcm/tcd/turbo loras*
- exception handler around vram memory stats gather
- improve ZLUDA installer with `--use-zluda` cli param, thanks @lshqqytiger
## Update for 2024-02-22
Only 3 weeks since last release, but here's another feature-packed one!
This time release schedule was shorter as we wanted to get some of the fixes out faster.
### Highlights 2024-02-22
- **IP-Adapters** & **FaceID**: multi-adapter and multi-image support
- New optimization engines: [DeepCache](https://github.com/horseee/DeepCache), [ZLUDA](https://github.com/vosen/ZLUDA) and **Dynamic Attention Slicing**
- New built-in pipelines: [Differential diffusion](https://github.com/exx8/differential-diffusion) and [Regional prompting](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#regional-prompting-pipeline)
- Big updates to: **Outpainting** (noised-edge-extend), **Clip-skip** (interpolate with non-integer values!), **CFG end** (prevent overburn on high CFG scales), **Control** module masking functionality
- All reported issues since the last release are addressed and included in this release
Further details:
- For basic instructions, see [README](https://github.com/vladmandic/automatic/blob/master/README.md)
- For more details on all new features see full [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
- For documentation, see [WiKi](https://github.com/vladmandic/automatic/wiki)
- [Discord](https://discord.com/invite/sd-next-federal-batch-inspectors-1101998836328697867) server
### Full ChangeLog for 2024-02-22
- **Improvements**:
- **IP Adapter** major refactor
- support for **multiple input images** per each ip adapter
- support for **multiple concurrent ip adapters**
*note*: you cannot mix & match ip adapters that use different *CLIP* models, for example `Base` and `Base ViT-G`
- add **adapter start/end** to settings, thanks @AI-Casanova
having adapter start late can help with better control over composition and prompt adherence
having adapter end early can help with overall quality and performance
- unified interface in txt2img, img2img and control
- enhanced xyz grid support
- **FaceID** now also works with multiple input images!
- [Differential diffusion](https://github.com/exx8/differential-diffusion)
img2img generation where you control strength of each pixel or image area
can be used with manually created masks or with auto-generated depth-maps
uses general denoising strength value
simply enable from *img2img -> scripts -> differential diffusion*
*note*: supports sd15 and sdxl models
- [Regional prompting](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#regional-prompting-pipeline) as a built-in solution
usage is same as original implementation from @hako-mikan
click on title to open docs and see examples of full syntax on how to use it
simply enable from *scripts -> regional prompting*
*note*: supports sd15 models only
- [DeepCache](https://github.com/horseee/DeepCache) model acceleration
it can produce massive speedups (2x-5x) with no overhead, but with some loss of quality
*settings -> compute -> model compile -> deep-cache* and *settings -> compute -> model compile -> cache interval*
- [ZLUDA](https://github.com/vosen/ZLUDA) experimental support, thanks @lshqqytiger
- ZLUDA is CUDA wrapper that can be used for GPUs without native support
- best use case is *AMD GPUs on Windows*, see [wiki](https://github.com/vladmandic/automatic/wiki/ZLUDA) for details
- **Outpaint** control outpaint now uses a new algorithm: noised-edge-extend
new method allows for much larger outpaint areas in a single pass, even outpaint 512->1024 works well
note that denoise strength should be increased for larger outpaint areas; for example, outpainting 512->1024 works well with denoise 0.75
outpaint can run in *img2img* mode (default) and *inpaint* mode where original image is masked (if inpaint masked only is selected)
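The noised-edge-extend idea above can be illustrated in one dimension: new pixels repeat the nearest edge value plus noise, giving the sampler structure to denoise into. A conceptual sketch only; the real implementation works on 2-D images/latents and is not this code:

```python
import random

# Conceptual 1-D illustration of "noised-edge-extend" outpainting:
# pad each side with the nearest edge value perturbed by noise.
# Not SD.Next's implementation; purely illustrative.

def noised_edge_extend(row, pad: int, noise: float, rng: random.Random):
    """Extend a row of pixel values by `pad` noised copies of each edge."""
    left = [row[0] + rng.uniform(-noise, noise) for _ in range(pad)]
    right = [row[-1] + rng.uniform(-noise, noise) for _ in range(pad)]
    return left + row + right

extended = noised_edge_extend([0.2, 0.5, 0.8], pad=2, noise=0.1, rng=random.Random(1))
print(len(extended))  # 3 original values + 2 padding on each side -> 7
```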
- **Clip-skip** reworked completely, thanks @AI-Casanova & @Disty0
now clip-skip range is 0-12 where previously lowest value was 1 (default is still 1)
values can also be decimal to interpolate between different layers, for example `clip-skip: 1.5`, thanks @AI-Casanova
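The decimal interpolation described above can be sketched as blending the hidden states of two adjacent CLIP text-encoder layers by the fractional part. This is an assumed formulation for illustration, not SD.Next's actual code:

```python
import math

# Sketch of how a fractional clip-skip such as 1.5 could blend the outputs of
# two adjacent CLIP layers. Plain lists stand in for tensors; illustrative only.

def blend_clip_layers(hidden_states: dict, clip_skip: float):
    """hidden_states maps integer clip-skip values to that layer's output."""
    lo, hi = math.floor(clip_skip), math.ceil(clip_skip)
    if lo == hi:                      # integer clip-skip: single layer
        return hidden_states[lo]
    t = clip_skip - lo                # fractional part sets the mix ratio
    return [a * (1 - t) + b * t
            for a, b in zip(hidden_states[lo], hidden_states[hi])]

print(blend_clip_layers({1: [0.0, 2.0], 2: [1.0, 4.0]}, 1.5))  # [0.5, 3.0]
```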
- **CFG End** new param to control image generation guidance, thanks @AI-Casanova
sometimes you want strong control over composition, but you want it to stop at some point
for example, when used with ip-adapters or controlnet, high cfg scale can overpower the guided image
- **Control**
- when performing inpainting, you can specify processing resolution using **size->mask**
- units now have extra option to re-use current preview image as processor input
- **Cross-attention** refactored cross-attention methods, thanks @Disty0
- for backend:original, its unchanged: SDP, xFormers, Doggettxs, InvokeAI, Sub-quadratic, Split attention
- for backend:diffusers, list is now: SDP, xFormers, Batch matrix-matrix, Split attention, Dynamic Attention BMM, Dynamic Attention SDP
note: you may need to update your settings! Attention Slicing is renamed to Split attention
- for ROCm, updated default cross-attention to Scaled Dot Product
- **Dynamic Attention Slicing**, thanks @Disty0
- dynamically slices attention queries in order to keep them under the slice rate
slicing is only triggered if the query size is larger than the slice rate, to gain performance
*Dynamic Attention Slicing BMM* uses *Batch matrix-matrix*
*Dynamic Attention Slicing SDP* uses *Scaled Dot Product*
- *settings -> compute settings -> attention -> dynamic attention slicing*
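The "slice only when needed" behavior above can be sketched with a pure-Python stand-in for the attention computation; this is illustrative, not @Disty0's actual tensor implementation:

```python
# Illustrative sketch of dynamic attention slicing: the query is chunked only
# when it exceeds the slice size, so small queries skip the slicing overhead.
# A pure-Python stand-in for the real batched matmul / SDP attention.

def dynamic_sliced_attention(query_rows, compute, slice_size: int):
    if len(query_rows) <= slice_size:        # small query: single pass
        return compute(query_rows)
    out = []
    for i in range(0, len(query_rows), slice_size):  # chunk by chunk
        out.extend(compute(query_rows[i:i + slice_size]))
    return out

double = lambda rows: [r * 2 for r in rows]  # stand-in for the attention op
print(dynamic_sliced_attention([1, 2, 3, 4, 5], double, slice_size=2))  # [2, 4, 6, 8, 10]
```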
- **ONNX**:
- allow specifying ONNX default provider and CPU fallback
*settings -> diffusers*
- allow manual install of specific onnx flavor
*settings -> onnx*
- better handling of `fp16` models/vae, thanks @lshqqytiger
- **OpenVINO** update to `torch 2.2.0`, thanks @Disty0
- **HyperTile** additional options thanks @Disty0
- add swap size option
- add use only for hires pass option
- add `--theme` cli param to force theme on startup
- add `--allow-paths` cli param to add additional paths that are allowed to be accessed via web, thanks @OuticNZ
- **Wiki**:
- added benchmark notes for IPEX, OpenVINO and Olive
- added ZLUDA wiki page
- **Internal**
- update dependencies
- refactor txt2img/img2img api
- enhanced theme loader
- add additional debug env variables
- enhanced sdp cross-optimization control
see *settings -> compute settings*
- experimental support for *python 3.12*
- **Fixes**:
- add variation seed to diffusers txt2img, thanks @AI-Casanova
- add cmd param `--skip-env` to skip setting of environment parameters during sdnext load
- handle extensions that install conflicting versions of packages
`onnxruntime`, `opencv2-python`
- installer refresh package cache on any install
- fix embeddings registration on server startup, thanks @AI-Casanova
- ipex handle dependencies, thanks @Disty0
- insightface handle dependencies
- img2img mask blur and padding
- xyz grid handle ip adapter name and scale
- fix lazy loading of image preventing metadata from being loaded on time
- allow startup without valid models folder
- fix interrogate api endpoint
- control fix resize causing runtime errors
- control fix processor override image after processor change
- control fix display grid with batch
- control restore pipeline before running scripts/extensions
- handle pipelines that return dict instead of object
- lora use strict name matching if preferred option is by-filename
- fix inpaint mask only for diffusers
- fix vae dtype mismatch, thanks @Disty0
- fix controlnet inpaint mask
- fix theme list refresh
- fix extensions update information in ui
- fix taesd with bfloat16
- fix model merge manual merge settings, thanks @AI-Casanova
- fix gradio instant update issues for textboxes in quicksettings
- fix rembg missing dependency
- bind controlnet extension to last known working commit, thanks @Aptronymist
- prompts-from-file fix resizable prompt area
## Update for 2024-02-07
Another big release just hit the shelves!
### Highlights 2024-02-07
- A lot more functionality in the **Control** module:
- Inpaint and outpaint support, flexible resizing options, optional hires
- Built-in support for many new processors and models, all auto-downloaded on first use
- Full support for scripts and extensions
- Complete **Face** module
implements all variations of **FaceID**, **FaceSwap** and latest **PhotoMaker** and **InstantID**
- Much enhanced **IPAdapter** modules
- Brand new **Intelligent masking**, manual or automatic
Using ML models (*LAMA* object removal, *REMBG* background removal, *SAM* segmentation, etc.) and with live previews
With granular blur, erode and dilate controls
- New models and pipelines:
**Segmind SegMoE**, **Mixture Tiling**, **InstaFlow**, **SAG**, **BlipDiffusion**
- Massive work integrating latest advances with [OpenVINO](https://github.com/vladmandic/automatic/wiki/OpenVINO), [IPEX](https://github.com/vladmandic/automatic/wiki/Intel-ARC) and [ONNX Olive](https://github.com/vladmandic/automatic/wiki/ONNX-Runtime-&-Olive)
- Full control over brightness, sharpness and color shifts and color grading during generate process directly in latent space
- **Documentation**! This was a big one, with a lot of new content and updates in the [WiKi](https://github.com/vladmandic/automatic/wiki)
Plus welcome additions to **UI performance, usability and accessibility** and flexibility of deployment as well as **API** improvements
And it also includes fixes for all reported issues so far
As of this release, the default backend is set to **diffusers** as it is more feature-rich than **original** and supports many additional models (the original backend remains fully supported)
Also, previous versions of **SD.Next** were tuned for balance between performance and resource usage.
With this release, focus is more on performance.
See [Benchmark](https://github.com/vladmandic/automatic/wiki/Benchmark) notes for details, but as a highlight, we are now hitting **~110-150 it/s** on a standard nVidia RTX4090 in optimal scenarios!
Further details:
- For basic instructions, see [README](https://github.com/vladmandic/automatic/blob/master/README.md)
- For more details on all new features see full [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
- For documentation, see [WiKi](https://github.com/vladmandic/automatic/wiki)
### Full ChangeLog 2024-02-07
- Heavily updated [Wiki](https://github.com/vladmandic/automatic/wiki)
- **Control**:
- new docs:
- [Control overview](https://github.com/vladmandic/automatic/wiki/Control)
- [Control guide](https://github.com/vladmandic/automatic/wiki/Control-Guide), thanks @Aptronymist
- add **inpaint** support
applies to both *img2img* and *controlnet* workflows
- add **outpaint** support
applies to both *img2img* and *controlnet* workflows
*note*: increase denoising strength since outpainted area is blank by default
- new **mask** module
- granular blur (gaussian), erode (reduce or remove noise) and dilate (pad or expand)
- optional **live preview**
- optional **auto-segmentation** using ml models
auto-segmentation can be done using **segment-anything** models or **rembg** models
*note*: auto segmentation will automatically expand user-masked area to segments that include current user mask
- optional **auto-mask**
if you don't provide a mask or the mask is empty, auto-mask can generate one automatically
this is especially useful for advanced masking on batch or video inputs where you don't want to manually mask each image
*note*: such auto-created mask is also subject to all other selected settings such as auto-segmentation, blur, erode and dilate
- optional **object removal** using LaMA model
remove selected objects from images with a single click
works best when combined with auto-segmentation to remove smaller objects
- masking can be combined with control processors in which case mask is applied before processor
- unmasked part of the image is optionally applied to the final image as overlay, see setting `mask_apply_overlay`
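The granular mask controls map to standard image-morphology operations; a minimal sketch of the blur/erode/dilate pipeline using Pillow (illustrative only, not SD.Next's internal implementation):

```python
from PIL import Image, ImageFilter

def refine_mask(mask: Image.Image, blur: float = 4.0,
                erode: int = 0, dilate: int = 0) -> Image.Image:
    """Apply erode (shrink), dilate (expand) and gaussian blur to a binary mask."""
    m = mask.convert("L")
    if erode:
        # MinFilter shrinks white regions: removes small noise speckles
        m = m.filter(ImageFilter.MinFilter(erode * 2 + 1))
    if dilate:
        # MaxFilter expands white regions: pads the masked area
        m = m.filter(ImageFilter.MaxFilter(dilate * 2 + 1))
    if blur:
        # gaussian blur softens the mask edge for smoother blending
        m = m.filter(ImageFilter.GaussianBlur(blur))
    return m
```

Erode-then-dilate in this order also acts as a simple noise filter, which is why the two are offered separately from blur.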
- support for many additional controlnet models
now built-in models include 30+ SD15 models and 15+ SDXL models
- allow **resize** both *before* and *after* generate operation
this allows for workflows such as: *image -> upscale or downscale -> generate -> upscale or downscale -> output*
providing more flexibility than the standard hires workflow
*note*: resizing before generate can be done using standard upscalers or latent
- implicit **hires**
since hires is only used for txt2img, control reuses existing resize functionality
any image size is used as txt2img target size
but if resize scale is also set, it's used to additionally upscale the image after the initial txt2img and for the hires pass
- add support for **scripts** and **extensions**
you can now combine control workflow with your favorite script or extension
*note*: extensions that are hard-coded for txt2img or img2img tabs may not work until they are updated
- add **depth-anything** depth map processor and trained controlnet
- add **marigold** depth map processor
this is a state-of-the-art depth estimation model, but it's quite heavy on resources
- add **openpose xl** controlnet
- add blip/booru **interrogate** functionality to both input and output images
- configurable output folder in settings
- auto-refresh available models on tab activate
- add image preview for override images set per-unit
- more compact unit layout
- reduce usage of temp files
- add context menu to action buttons
- move ip-adapter implementation to control tabs
- resize by now applies to input image or frame individually
allows for processing where input images are of different sizes
- support controlnets with non-default yaml config files
- implement resize modes for override images
- allow any selection of units
- dynamically install dependencies required by specific processors
- fix input image size
- fix video color mode
- fix correct image mode
- fix batch/folder/video modes
- fix processor switching within same unit
- fix pipeline switching between different modes
- **Face** module
implements all variations of **FaceID** and **FaceSwap**, plus the latest **PhotoMaker** and **InstantID**
simply select from scripts and choose your favorite method and model
*note*: all models are auto-downloaded on first use
- [FaceID](https://huggingface.co/h94/IP-Adapter-FaceID)
- faceid guides image generation given the input image
- full implementation for *SD15* and *SD-XL*, to use simply select from *Scripts*
**Base** (93MB) uses *InsightFace* to generate face embeds and *OpenCLIP-ViT-H-14* (2.5GB) as image encoder
**Plus** (150MB) uses *InsightFace* to generate face embeds and *CLIP-ViT-H-14-laion2B* (3.8GB) as image encoder
**SDXL** (1022MB) uses *InsightFace* to generate face embeds and *OpenCLIP-ViT-bigG-14* (3.7GB) as image encoder
- [FaceSwap](https://github.com/deepinsight/insightface/blob/master/examples/in_swapper/README.md)
- face swap performs face swapping at the end of generation
- based on InsightFace in-swapper
- [PhotoMaker](https://github.com/TencentARC/PhotoMaker)
- for *SD-XL* only
- new model from TencentARC using a similar concept as IPAdapter, but with a different implementation,
allowing full concept swaps between input images and generated images using trigger words
- note: trigger word must match exactly one term in prompt for model to work
- [InstantID](https://github.com/InstantID/InstantID)
- for *SD-XL* only
- based on custom trained ip-adapter and controlnet combined concepts
- note: controlnet appears to be heavily watermarked
- enable use via api, thanks @trojaner
- [IPAdapter](https://huggingface.co/h94/IP-Adapter)
- additional models for *SD15* and *SD-XL*, to use simply select from *Scripts*:
**SD15**: Base, Base ViT-G, Light, Plus, Plus Face, Full Face
**SDXL**: Base SDXL, Base ViT-H SDXL, Plus ViT-H SDXL, Plus Face ViT-H SDXL
- enable use via api, thanks @trojaner
- [Segmind SegMoE](https://github.com/segmind/segmoe)
- initial support for reference models
download & load via networks -> models -> reference -> **SegMoE SD 4x2** (3.7GB), **SegMoE XL 2x1** (10GB), **SegMoE XL 4x2**
- note: since segmoe is essentially a sequential mix of unets from multiple models, it can get large
SD 4x2 is ~4GB, XL 2x1 is ~10GB and XL 4x2 is 18GB
- supports lora, thanks @AI-Casanova
- support for create and load custom mixes will be added in the future
- [Mixture Tiling](https://arxiv.org/abs/2302.02412)
- uses multiple prompts to guide different parts of the grid during diffusion process
- can be used to create complex scenes with multiple subjects
- simply select from scripts
- [Self-attention guidance](https://github.com/SusungHong/Self-Attention-Guidance)
- simply select scale in advanced menu
- can drastically improve image coherence as well as reduce artifacts
- note: only compatible with some schedulers
- [FreeInit](https://tianxingwu.github.io/pages/FreeInit/) for **AnimateDiff**
- greatly improves temporal consistency of generated outputs
- all options are available in the animatediff script
- [SalesForce BlipDiffusion](https://huggingface.co/docs/diffusers/api/pipelines/blip_diffusion)
- model can be used to place subject in a different context
- requires input image
- last word in prompt and negative prompt will be used as source and target subjects
- sampler must be set to default before loading the model
- [InstaFlow](https://github.com/gnobitab/InstaFlow)
- another take on super-fast image generation in a single step
- set *sampler:default, steps:1, cfg-scale:0*
- load from networks -> models -> reference
- **Improvements**
- **ui**
- check version and **update** SD.Next via UI
simply go to: settings -> update
- globally configurable **font size**
will dynamically rescale ui depending on settings -> user interface
- built-in **themes** can be changed on-the-fly
this does not work with gradio-default themes as css is created by gradio itself
- two new **themes**: *simple-dark* and *simple-light*
- modularized blip/booru interrogate
now appears as toolbuttons on image/gallery output
- faster browser page load
- update hints, thanks @brknsoul
- cleanup settings
- **server**
- all move/offload options are disabled by default for optimal performance
enable manually if low on vram
- **server startup**: performance
- reduced module imports
ldm support is now only loaded when running in backend=original
- faster extension load
- faster json parsing
- faster lora indexing
- lazy load optional imports
- batch embedding load, thanks @midcoastal and @AI-Casanova
10x+ faster embeddings load for large numbers of embeddings; now works for 1000+ embeddings
- file and folder list caching, thanks @midcoastal
if you have a lot of files and/or are using slower or non-local storage, this speeds up file access a lot
- add `SD_INSTALL_DEBUG` env variable to trace all `git` and `pip` operations
- **extra networks**
- 4x faster civitai metadata and previews lookup
- better display and selection of tags & trigger words
if hashes are calculated, trigger words will only be displayed for actual model version
- better matching of previews
- better search, including searching for multiple keywords or using full regex
see wiki page for more details on syntax
thanks @NetroScript
- reduce html overhead
- **model compression**, thanks @Disty0
- using built-in NNCF model compression, you can reduce the size of your models significantly
example: up to 3.4GB of VRAM saved for SD-XL model!
- see [wiki](https://github.com/vladmandic/automatic/wiki/Model-Compression-with-NNCF) for details
- **embeddings**
you can now use sd 1.5 embeddings with your sd-xl models, thanks @AI-Casanova
conversion is done on-the-fly, is completely transparent, and the result is an approximation of the original embedding
to enable: settings->extra networks->auto-convert embeddings
- **offline deployment**: allow deployment without git clone
for example, you can now deploy a zip of the sdnext folder
- **latent upscale**: updated latent upscalers (some are new)
*nearest, nearest-exact, area, bilinear, bicubic, bilinear-antialias, bicubic-antialias*
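The listed modes correspond to `torch.nn.functional.interpolate` resampling modes; a rough sketch of upscaling a latent tensor this way (illustrative, not SD.Next's exact code path):

```python
import torch
import torch.nn.functional as F

def upscale_latent(latent: torch.Tensor, scale: float,
                   mode: str = "bilinear", antialias: bool = False) -> torch.Tensor:
    """Resize a latent batch (B, C, H, W) with a given resampling mode."""
    kwargs = {}
    # only bilinear/bicubic accept the antialias flag
    if mode in ("bilinear", "bicubic"):
        kwargs["antialias"] = antialias
    return F.interpolate(latent, scale_factor=scale, mode=mode, **kwargs)
```

The `-antialias` variants in the list above simply toggle the `antialias` flag on the base mode.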
- **scheduler**: added `SA Solver`
- **model load to gpu**
new option in settings->diffusers allowing models to be loaded directly to GPU while keeping RAM free
this option is not compatible with any kind of model offloading as model is expected to stay in GPU
additionally, all model-moves can now be traced with env variable `SD_MOVE_DEBUG`
- **xyz grid**
- range control
example: `5.0-6.0:3` will generate 3 images with values `5.0,5.5,6.0`
example: `10-20:4` will generate 4 images with values `10,13,16,20`
- continue on error
now you can use xyz grid with different params and test which ones work and which don't
- correct font scaling, thanks @nCoderGit
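The range syntax can be sketched as a small parser (a hypothetical helper matching the two examples above, not the actual SD.Next function):

```python
def parse_range(value: str) -> list:
    """Parse xyz-grid range syntax like '5.0-6.0:3' into evenly spaced values."""
    span, _, count = value.partition(":")
    start, _, stop = span.partition("-")
    n = int(count)
    is_int = "." not in start and "." not in stop
    lo, hi = float(start), float(stop)
    step = (hi - lo) / (n - 1) if n > 1 else 0.0
    vals = [lo + step * i for i in range(n)]
    # integer ranges are truncated to whole numbers, float ranges keep decimals
    return [int(v) for v in vals] if is_int else [round(v, 4) for v in vals]
```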
- **hypertile**
- enable vae tiling
- add autodetection of optimal value
set tile size to 0 to use autodetected value
- **cli**
- `sdapi.py` allow manual api invoke
example: `python cli/sdapi.py /sdapi/v1/sd-models`
- `image-exif.py` improve metadata parsing
- `install-sf` helper script to automatically find best available stable-fast package for the platform
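Invoking the same endpoints directly from Python is also straightforward; a minimal sketch using only the standard library (the default port 7860 and the endpoint path are taken from the example above):

```python
import json
import urllib.request

def api_url(endpoint: str, base: str = "http://127.0.0.1:7860") -> str:
    """Build a full URL for an SD.Next API endpoint such as /sdapi/v1/sd-models."""
    return base.rstrip("/") + "/" + endpoint.lstrip("/")

def api_get(endpoint: str, base: str = "http://127.0.0.1:7860"):
    """Perform a GET request and decode the JSON response."""
    with urllib.request.urlopen(api_url(endpoint, base)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# usage (requires a running server):
# models = api_get("/sdapi/v1/sd-models")
```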
- **memory**: add ram usage monitoring in addition to gpu memory usage monitoring
- **vae**: enable taesd batch decode
enable/disable with settings -> diffusers > vae slicing
- **compile**
- new option: **fused projections**
pretty much free 5% performance boost for compatible models
enable in settings -> compute settings
- new option: **dynamic quantization** (experimental)
reduces memory usage and increases performance
enable in settings -> compute settings
best used together with torch compile: *inductor*
this feature is highly experimental and will evolve over time
requires nightly versions of `torch` and `torchao`
> `pip install -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121`
> `pip install -U git+https://github.com/pytorch-labs/ao`
- new option: **compile text encoder** (experimental)
- **correction**
- new section in generate, allows for image corrections during generation directly in latent space
- adds *brightness*, *sharpness* and *color* controls, thanks @AI-Casanova
- adds *color grading* controls, thanks @AI-Casanova
- replaces old **hdr** section
- **IPEX**, thanks @disty0
- see [wiki](https://github.com/vladmandic/automatic/wiki/Intel-ARC) for details
- rewrite ipex hijacks without CondFunc
improves compatibility and performance
fixes random memory leaks
- out of the box support for Intel Data Center GPU Max Series
- remove IPEX / Torch 2.0 specific hijacks
- add `IPEX_SDPA_SLICE_TRIGGER_RATE`, `IPEX_ATTENTION_SLICE_RATE` and `IPEX_FORCE_ATTENTION_SLICE` env variables
- disable 1024x1024 workaround if the GPU supports 64 bit
- fix lock-ups at very high resolutions
- **OpenVINO**, thanks @disty0
- see [wiki](https://github.com/vladmandic/automatic/wiki/OpenVINO) for details
- **quantization support with NNCF**
run 8 bit directly without autocast
enable *OpenVINO Quantize Models with NNCF* from *Compute Settings*
- **4-bit support with NNCF**
enable *Compress Model weights with NNCF* from *Compute Settings* and set a 4-bit NNCF mode
select both CPU and GPU from the device selection if you want to use 4-bit or 8-bit modes on GPU
- experimental support for *Text Encoder* compiling
OpenVINO is faster than IPEX now
- update to OpenVINO 2023.3.0
- add device selection to `Compute Settings`
selecting multiple devices will use `HETERO` device
- remove `OPENVINO_TORCH_BACKEND_DEVICE` env variable
- reduce system memory usage after compile
- fix cache loading with multiple models
- **Olive** support, thanks @lshqqytiger
- fully merged in; see [wiki](https://github.com/vladmandic/automatic/wiki/ONNX-Runtime-&-Olive) for details
- as a highlight, 4-5 it/s using DirectML on AMD GPU translates to 23-25 it/s using ONNX/Olive!
- **fixes**
- civitai model download: enable downloads of embeddings
- ipadapter: allow changing of model/image on-the-fly
- ipadapter: fix fallback of cross-attention on unload
- rebasin iterations, thanks @AI-Casanova
- prompt scheduler, thanks @AI-Casanova
- python: fix python 3.9 compatibility
- sdxl: fix positive prompt embeds
- img2img: clip and blip interrogate
- img2img: sampler selection offset
- img2img: support variable aspect ratio without explicit resize
- cli: add `simple-upscale.py` script
- cli: fix cmd args parsing
- cli: add `run-benchmark.py` script
- api: add `/sdapi/v1/version` endpoint
- api: add `/sdapi/v1/platform` endpoint
- api: return current image in progress api if requested
- api: sanitize response object
- api: cleanup error logging
- api: fix api-only errors
- api: fix image to base64
- api: fix upscale
- refiner: fix use of sd15 model as refiners in second pass
- refiner: enable none as option in xyz grid
- sampler: add sampler options info to metadata
- sampler: guard against invalid sampler index
- sampler: add img2img_extra_noise option
- config: reset default cfg scale to 6.0
- hdr: fix math, thanks @AI-Casanova
- processing: correct display metadata
- processing: fix batch file names
- live preview: fix when using `bfloat16`
- live preview: add thread locking
- upscale: fix ldsr
- huggingface: handle fallback model variant on load
- reference: fix links to models and use safetensors where possible
- model merge: fix merging of unbalanced models where not all keys are present, thanks @AI-Casanova
- better sdxl model detection
- global crlf->lf switch
- fix model type switch when submodels are loaded
- cleanup samplers use of compute devices, thanks @Disty0
- **other**
- extension `sd-webui-controlnet` is locked to commit `ecd33eb` due to breaking changes
- extension `stable-diffusion-webui-images-browser` is locked to commit `27fe4a7` due to breaking changes
- updated core requirements
- fully dynamic pipelines
pipeline switch is now done on-the-fly and does not require manual initialization of individual components
this allows for quick implementation of new pipelines
see `modules/sd_models.py:switch_pipe` for details
- major internal ui module refactoring
this may cause compatibility issues if an extension is doing a direct import from `ui.py`
in which case, report it so we can add a compatibility layer
- major public api refactoring
this may cause compatibility issues if an extension is doing a direct import from `api.py` or `models.py`
in which case, report it so we can add a compatibility layer
## Update for 2023-12-29
To wrap up this amazing year, we're releasing a new version of [SD.Next](https://github.com/vladmandic/automatic), and this one is absolutely massive!
### Highlights 2023-12-29
- Brand new Control module for *text, image, batch and video* processing
Native implementation of all control methods for both *SD15* and *SD-XL*
**ControlNet | ControlNet XS | Control LLLite | T2I Adapters | IP Adapters**
For details, see [Wiki](https://github.com/vladmandic/automatic/wiki/Control) documentation:
- Support for new models types out-of-the-box
This brings the number of supported t2i/i2i model families to 13!
**Stable Diffusion 1.5/2.1 | SD-XL | LCM | Segmind | Kandinsky | Pixart-α | Würstchen | aMUSEd | DeepFloyd IF | UniDiffusion | SD-Distilled | BLiP Diffusion | etc.**
- New video capabilities:
**AnimateDiff | SVD | ModelScope | ZeroScope**
- Enhanced platform support
**Windows | Linux | MacOS** with **nVidia | AMD | IntelArc | DirectML | OpenVINO | ONNX+Olive** backends
- Better onboarding experience (first install)
with all model types available for single click download & load (networks -> reference)
- Performance optimizations!
For a comparison of different processing options and compile backends, see [Wiki](https://github.com/vladmandic/automatic/wiki/Benchmark)
As a highlight, we're reaching **~100 it/s** (no tricks, this is with full features enabled and end-to-end on a standard nVidia RTX4090)
- New [custom pipelines](https://github.com/vladmandic/automatic/blob/dev/scripts/example.py) framework for quickly porting any new pipeline
And other improvements in areas such as: Upscaling (up to 8x now with 40+ available upscalers), Inpainting (better quality), Prompt scheduling, new Sampler options, new LoRA types, additional UI themes, better HDR processing, built-in Video interpolation, parallel Batch processing, etc.
Plus some nifty new modules such as **FaceID** (automatic face guidance using embeds during generation) and **Depth 3D** (image to 3D scene)
### Full ChangeLog 2023-12-29
- **Control**
- native implementation of all image control methods:
**ControlNet**, **ControlNet XS**, **Control LLLite**, **T2I Adapters** and **IP Adapters**
- top-level **Control** next to **Text** and **Image** generate
- supports all variations of **SD15** and **SD-XL** models
- supports *Text*, *Image*, *Batch* and *Video* processing
- for details and list of supported models and workflows, see Wiki documentation:
<https://github.com/vladmandic/automatic/wiki/Control>
- **Diffusers**
- [Segmind Vega](https://huggingface.co/segmind/Segmind-Vega) model support
- small and fast version of **SDXL**, only 3.1GB in size!
- select from *networks -> reference*
- [aMUSEd 256](https://huggingface.co/amused/amused-256) and [aMUSEd 512](https://huggingface.co/amused/amused-512) model support
- lightweight models that excel at fast image generation
- *note*: must select: settings -> diffusers -> generator device: unset
- select from *networks -> reference*
- [Playground v1](https://huggingface.co/playgroundai/playground-v1), [Playground v2 256](https://huggingface.co/playgroundai/playground-v2-256px-base), [Playground v2 512](https://huggingface.co/playgroundai/playground-v2-512px-base), [Playground v2 1024](https://huggingface.co/playgroundai/playground-v2-1024px-aesthetic) model support
- comparable to SD15 and SD-XL, trained from scratch for highly aesthetic images
- simply select from *networks -> reference* and use as usual
- [BLIP-Diffusion](https://dxli94.github.io/BLIP-Diffusion-website/)
- img2img model that can replace subjects in images using prompt keywords
- download and load by selecting from *networks -> reference -> blip diffusion*
- in image tab, select `blip diffusion` script
- [DemoFusion](https://github.com/PRIS-CV/DemoFusion) run your SDXL generations at any resolution!
- in **Text** tab select *script* -> *demofusion*
- *note*: GPU VRAM limits do not automatically go away so be careful when using it with large resolutions
in the future, expect more optimizations, especially related to offloading/slicing/tiling,
but at the moment this is pretty much experimental-only
- [AnimateDiff](https://github.com/guoyww/animatediff/)
- overall improved quality
- can now be used with *second pass* - enhance, upscale and hires your videos!
- [IP Adapter](https://github.com/tencent-ailab/IP-Adapter)
- add support for **ip-adapter-plus_sd15, ip-adapter-plus-face_sd15 and ip-adapter-full-face_sd15**
- can now be used in *xyz-grid*
- **Text-to-Video**
- in text tab, select `text-to-video` script
- supported models: **ModelScope v1.7b, ZeroScope v1, ZeroScope v1.1, ZeroScope v2, ZeroScope v2 Dark, Potat v1**
*if you know of any other t2v models you'd like to see supported, let me know!*
- models are auto-downloaded on first use
- *note*: current base model will be unloaded to free up resources
- **Prompt scheduling** now implemented for Diffusers backend, thanks @AI-Casanova
- **Custom pipelines** contribute by adding your own custom pipelines!
- for details, see fully documented example:
<https://github.com/vladmandic/automatic/blob/dev/scripts/example.py>
- **Schedulers**
- add timesteps range; changing it will make the scheduler over-complete or under-complete
- add rescale betas with zero SNR option (applicable to Euler, Euler a and DDIM, allows for higher dynamic range)
- **Inpaint**
- improved quality when using mask blur and padding
- **UI**
- 3 new native UI themes: **orchid-dreams**, **emerald-paradise** and **timeless-beige**, thanks @illu_Zn
- more dynamic controls depending on the backend (original or diffusers)
controls that are not applicable in current mode are now hidden
- allow setting of resize method directly in image tab
(previously via settings -> upscaler_for_img2img)
- **Optional**
- **FaceID** face guidance during generation
- also based on IP adapters, but with additional face detection and external embeddings calculation
- calculates face embeds based on input image and uses it to guide generation
- simply select from *scripts -> faceid*
- *experimental module*: requirements must be installed manually:
> pip install insightface ip_adapter
- **Depth 3D** image to 3D scene
- delivered as an extension, install from extensions tab
<https://github.com/vladmandic/sd-extension-depth3d>
- creates fully compatible 3D scene from any image by using depth estimation
and creating a fully populated mesh
- scene can be freely viewed in 3D in the UI itself or downloaded for use in other applications
- [ONNX/Olive](https://github.com/vladmandic/automatic/wiki/ONNX-Olive)
- major work continues in olive branch, see wiki for details, thanks @lshqqytiger
as a highlight, 4-5 it/s using DirectML on AMD GPU translates to 23-25 it/s using ONNX/Olive!
- **General**
- new **onboarding**
- if no models are found during startup, app will no longer ask to download default checkpoint
instead, it will show message in UI with options to change model path or download any of the reference checkpoints
- *extra networks -> models -> reference* section is now enabled for both original and diffusers backend
- support for **Torch 2.1.2** (release) and **Torch 2.3** (dev)
- **Process** create videos from batch or folder processing
supports *GIF*, *PNG* and *MP4* with full interpolation, scene change detection, etc.
- **LoRA**
- add support for block weights, thanks @AI-Casanova
example `<lora:SDXL_LCM_LoRA:1.0:in=0:mid=1:out=0>`
- add support for LyCORIS GLora networks
- add support for LoRA PEFT (*Diffusers*) networks
- add support for Lora-OFT (*Kohya*) and Lyco-OFT (*Kohaku*) networks
- reintroduce alternative loading method in settings: `lora_force_diffusers`
- add support for `lora_fuse_diffusers` if using alternative method
use if you have multiple complex loras that may be causing performance degradation
as it fuses lora with model during load instead of interpreting lora on-the-fly
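The block-weight tag syntax shown above can be sketched as a small parser (a hypothetical helper, shown only to illustrate the syntax, not SD.Next's actual lora code):

```python
import re

def parse_lora_tag(tag: str):
    """Parse '<lora:name:weight:in=0:mid=1:out=0>' into name, weight and block weights."""
    m = re.fullmatch(r"<lora:([^:>]+):([0-9.]+)((?::\w+=[0-9.]+)*)>", tag)
    if not m:
        raise ValueError(f"not a lora tag: {tag}")
    name, weight, extras = m.group(1), float(m.group(2)), m.group(3)
    blocks = {}
    # optional trailing ':key=value' pairs carry the per-block weights
    for part in filter(None, extras.split(":")):
        k, v = part.split("=")
        blocks[k] = float(v)
    return name, weight, blocks
```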
- **CivitAI downloader** allow usage of access tokens for download of gated or private models
- **Extra networks** new *setting -> extra networks -> build info on first access*
indexes all networks on first access instead of server startup
- **IPEX**, thanks @disty0
- update to **Torch 2.1**
if you get file not found errors, set `DISABLE_IPEXRUN=1` and run the webui with `--reinstall`
- built-in *MKL* and *DPCPP* for IPEX, no need to install OneAPI anymore
- **StableVideoDiffusion** is now supported with IPEX
- **8 bit support with NNCF** on Diffusers backend
- fix IPEX Optimize not applying with Diffusers backend
- disable 32bit workarounds if the GPU supports 64bit
- add `DISABLE_IPEXRUN` and `DISABLE_IPEX_1024_WA` environment variables
- performance and compatibility improvements
- **OpenVINO**, thanks @disty0
- **8 bit support for CPUs**
- reduce System RAM usage
- update to Torch 2.1.2
- add *Directory for OpenVINO cache* option to *System Paths*
- remove Intel ARC specific 1024x1024 workaround
- **HDR controls**
- batch-aware for enhancement of multiple images or video frames
- available in image tab
- **Logging**
- additional *TRACE* logging enabled via specific env variables
see <https://github.com/vladmandic/automatic/wiki/Debug> for details
- improved profiling
use with `--debug --profile`
- log output file sizes
- **Other**
- **API** several minor but breaking changes to API behavior to better align response fields, thanks @Trojaner
- **Inpaint** add option `apply_overlay` to control if inpaint result should be applied as overlay or as-is
can remove artifacts and hard edges of inpaint area but also remove some details from original
- **chaiNNer** fix `NaN` issues due to autocast
- **Upscale** increase limit from 4x to 8x given the quality of some upscalers
- **Extra Networks** fix sort
- reduced default **CFG scale** from 6 to 4 to be more out-of-the-box compatible with LCM/Turbo models
- disable google fonts check on server startup
- fix torchvision/basicsr compatibility
- fix styles quick save
- add hdr settings to metadata
- improve handling of long filenames and filenames during batch processing
- do not set preview samples when using via api
- avoid unnecessary resizes in img2img and inpaint
- safe handling of config updates to avoid file corruption on I/O errors
- updated `cli/simple-txt2img.py` and `cli/simple-img2img.py` scripts
- save `params.txt` regardless of image save status
- update built-in log monitor in ui, thanks @midcoastal
- major CHANGELOG doc cleanup, thanks @JetVarimax
- major INSTALL doc cleanup, thanks @JetVarimax
## Update for 2023-12-04
What's new? Native video in SD.Next via both **AnimateDiff** and **Stable-Video-Diffusion** - including native MP4 encoding and smooth video outputs out-of-the-box, not just animated-GIFs.
Also new is support for **SDXL-Turbo** as well as new **Kandinsky 3** models and cool latent correction via **HDR controls** for any *txt2img* workflows, best-of-class **SDXL model merge** using full ReBasin methods and further mobile UI optimizations.
- **Diffusers**
- **IP adapter**
- lightweight native implementation of IP adapters which can guide generation towards a specific image style
- supports most T2I models, not limited to SD 1.5
- models are auto-downloaded on first use
- for IP adapter support in *Original* backend, use standard *ControlNet* extension
- **AnimateDiff**
- lightweight native implementation of AnimateDiff models:
*AnimateDiff 1.4, 1.5 v1, 1.5 v2, AnimateFace*
- supports SD 1.5 only
- models are auto-downloaded on first use
- for video saving support, see video support section
- can be combined with IP-Adapter for even better results!
- for AnimateDiff support in *Original* backend, use standard *AnimateDiff* extension
- **HDR latent control**, based on [article](https://huggingface.co/blog/TimothyAlexisVass/explaining-the-sdxl-latent-space#long-prompts-at-high-guidance-scales-becoming-possible)
- in *Advanced* params
- allows control of *latent clamping*, *color centering* and *range maximization*
- supported by *XYZ grid*
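The three controls roughly correspond to simple per-channel tensor operations on the latent; a hedged numpy sketch of what clamping, centering and range maximization could look like (function and parameter names are illustrative, not SD.Next's internals):

```python
import numpy as np

def hdr_correct(latent: np.ndarray, clamp: float = 4.0,
                center: bool = True, maximize: bool = False) -> np.ndarray:
    """Apply latent clamping, color centering and range maximization per channel."""
    out = latent.astype(np.float64).copy()
    # clamping: limit extreme latent values that cause blown-out regions
    out = np.clip(out, -clamp, clamp)
    if center:
        # centering: shift each channel so its mean is zero
        out -= out.mean(axis=(-2, -1), keepdims=True)
    if maximize:
        # maximization: rescale each channel to use the full clamp range
        peak = np.abs(out).max(axis=(-2, -1), keepdims=True)
        out = np.where(peak > 0, out / peak * clamp, out)
    return out
```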
- [SD21 Turbo](https://huggingface.co/stabilityai/sd-turbo) and [SDXL Turbo](<https://huggingface.co/stabilityai/sdxl-turbo>) support
- just set CFG scale (0.0-1.0) and steps (1-3) to very low values
- compatible with original StabilityAI SDXL-Turbo or any of the newer merges
- download safetensors or select from networks -> reference
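With the Diffusers backend this corresponds to single-step generation at (near-)zero guidance; a sketch using the upstream `diffusers` API (loading is wrapped in a function since the checkpoint is several GB; the settings mirror the note above):

```python
def load_turbo(model_id: str = "stabilityai/sdxl-turbo"):
    """Load SDXL-Turbo; imports are deferred so the helper is cheap to define."""
    import torch
    from diffusers import AutoPipelineForText2Image
    return AutoPipelineForText2Image.from_pretrained(
        model_id, torch_dtype=torch.float16, variant="fp16"
    ).to("cuda")

# turbo models want 1-3 steps and CFG scale at (or near) zero
TURBO_KWARGS = {"num_inference_steps": 1, "guidance_scale": 0.0}

# usage (downloads the model on first run):
# pipe = load_turbo()
# image = pipe("a cinematic photo of a lighthouse", **TURBO_KWARGS).images[0]
```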
- [Stable Video Diffusion](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid) and [Stable Video Diffusion XT](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt) support
- download using built-in model downloader or simply select from *networks -> reference*
support for manually downloaded safetensors models will be added later
- for video saving support, see video support section
- go to *image* tab, enter input image and select *script* -> *stable video diffusion*
- [Kandinsky 3](https://huggingface.co/kandinsky-community/kandinsky-3) support
- download using built-in model downloader or simply select from *networks -> reference*
- this model is absolutely massive at 27.5GB at fp16, so be patient
- model params count is 11.9B (compared to SD-XL at 3.3B) and it's trained on mixed resolutions from 256px to 1024px
- use either model offload or sequential cpu offload to be able to use it
- better autodetection of *inpaint* and *instruct* pipelines
- support long secondary prompt for refiner
- **Video support**
- applies to any model that supports video generation, e.g. AnimateDiff and StableVideoDiffusion
- support for **animated-GIF**, **animated-PNG** and **MP4**
- GIF and PNG can be looped
- MP4 can have additional padding at the start/end as well as motion-aware interpolated frames for smooth playback
interpolation is done using [RIFE](https://arxiv.org/abs/2011.06294) with a native implementation in SD.Next
And it's fast: interpolating 16 frames at 10x to a target of 160 frames takes 2-3 sec
- output folder for videos is in *settings -> image paths -> video*
- **General**
- redesigned built-in profiler
- now includes both `python` and `torch` and traces individual functions
- use with `--debug --profile`
- **model merge** add **SD-XL ReBasin** support, thanks @AI-Casanova
- further UI optimizations for **mobile devices**, thanks @iDeNoh
- log level defaults to info for console and debug for log file
- better prompt display in process tab
- increase maximum lora cache values
- fix extra networks sorting
- fix controlnet compatibility issues in original backend
- fix img2img/inpaint paste params
- fix save text file for manually saved images
- fix python 3.9 compatibility issues
## Update for 2023-11-23
New release, primarily focused around three major new features: full **LCM** support, completely new **Model Merge** functionality and **Stable-fast** compile support
Also included are several other improvements and a large number of hotfixes - see full changelog for details
- **Diffusers**
- **LCM** support for any *SD 1.5* or *SD-XL* model!
- download [lcm-lora-sd15](https://huggingface.co/latent-consistency/lcm-lora-sdv1-5/tree/main) and/or [lcm-lora-sdxl](https://huggingface.co/latent-consistency/lcm-lora-sdxl/tree/main)
- load your favorite *SD 1.5* or *SD-XL* model *(original LCM was SD 1.5 only, this works for both)*
- load **lcm lora** *(note: lcm lora is processed differently than any other lora)*
- set **sampler** to **LCM**
- set number of steps to some low number, for SD-XL 6-7 steps is normally sufficient
note: LCM scheduler does not support steps higher than 50
- set CFG to between 1 and 2
- Add `cli/lcm-convert.py` script to convert any SD 1.5 or SD-XL model to LCM model
by baking in LORA and uploading to Huggingface, thanks @Disty0
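The same recipe expressed with upstream `diffusers` (a hedged sketch; the model IDs are from the links above, and the heavy load is deferred into a function):

```python
def load_lcm_sdxl(base: str = "stabilityai/stable-diffusion-xl-base-1.0",
                  lcm_lora: str = "latent-consistency/lcm-lora-sdxl"):
    """Load an SD-XL model, attach the LCM LoRA and switch to the LCM scheduler."""
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler
    pipe = DiffusionPipeline.from_pretrained(base, torch_dtype=torch.float16).to("cuda")
    # the LCM sampler is required; LCM lora is loaded like any other lora weights
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights(lcm_lora)
    return pipe

# per the notes above: few steps (6-7 for SD-XL), CFG between 1 and 2
LCM_KWARGS = {"num_inference_steps": 6, "guidance_scale": 1.5}
```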
- Support for [Stable Fast](https://github.com/chengzeyi/stable-fast) model compile on *Windows/Linux/WSL2* with *CUDA*
See [Wiki:Benchmark](https://github.com/vladmandic/automatic/wiki/Benchmark) for details and comparison
of different backends, precision modes, advanced settings and compile modes
*Hint*: **70+ it/s** is possible on *RTX4090* with no special tweaks
- Add additional pipeline types for manual model loads when loading from `safetensors`
- Updated logic for calculating **steps** when using base/hires/refiner workflows
- Improve **model offloading** for both model and sequential cpu offload when dealing with meta tensors
- Safe model offloading for non-standard models
- Fix **DPM SDE** scheduler
- Better support for SD 1.5 **inpainting** models
- Add support for **OpenAI Consistency decoder VAE**
- Enhance prompt parsing with long prompts and support for *BREAK* keyword
Change-in-behavior: new line in prompt now means *BREAK*
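The chunking behavior above can be illustrated with a minimal splitter (a sketch only; the actual SD.Next parser also handles attention weights and embeddings):

```python
import re

def split_prompt(prompt: str) -> list:
    """Split a prompt into separately-encoded chunks on BREAK or newline."""
    # per the change-in-behavior above, a newline acts like an explicit BREAK
    parts = re.split(r"\bBREAK\b|\n", prompt)
    return [p.strip() for p in parts if p.strip()]

chunks = split_prompt("a castle on a hill BREAK dramatic lighting\noil painting")
print(chunks)  # → ['a castle on a hill', 'dramatic lighting', 'oil painting']
```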
- Add alternative Lora loading algorithm, triggered if `SD_LORA_DIFFUSERS` is set
- **Models**
- **Model merge**
- completely redesigned, now based on best-of-class `meh` by @s1dlx
and heavily modified for additional functionality and fully integrated by @AI-Casanova (thanks!)
- merge SD or SD-XL models using *simple merge* (12 methods),
using one of *presets* (20 built-in presets) or custom block merge values
- merge with ReBasin permutations and/or clipping protection
- fully multithreaded for fastest merge possible
- **Model update**
- under *UI -> Models -> Update*
- scan existing models for updated metadata on CivitAI and
provide download functionality for models with available updates
- **Extra networks**
- Use multi-threading for 5x load speedup
- Better Lora trigger words support
- Auto refresh styles on change
- **General**
- Many **mobile UI** optimizations, thanks @iDeNoh
- Support for **Torch 2.1.1** with CUDA 12.1 or CUDA 11.8
- Configurable location for HF cache folder
Default is standard `~/.cache/huggingface/hub`
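Under the hood, huggingface libraries resolve the cache location from environment variables; a sketch of that resolution order (`HF_HUB_CACHE` and `HF_HOME` are standard huggingface_hub variables, the helper name here is illustrative):

```python
import os

def hf_cache_dir() -> str:
    """Resolve the Huggingface hub cache folder: explicit HF_HUB_CACHE wins,
    then HF_HOME/hub, then the standard ~/.cache/huggingface/hub default."""
    if os.environ.get("HF_HUB_CACHE"):
        return os.environ["HF_HUB_CACHE"]
    if os.environ.get("HF_HOME"):
        return os.path.join(os.environ["HF_HOME"], "hub")
    return os.path.expanduser("~/.cache/huggingface/hub")

os.environ.pop("HF_HUB_CACHE", None)
os.environ["HF_HOME"] = "/models/hf"
print(hf_cache_dir())  # → /models/hf/hub
```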
- Reworked parser when pasting previously generated images/prompts
includes all `txt2img`, `img2img` and `override` params
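The parameter line embedded in generated images looks like `Steps: 20, Sampler: Euler a, ...`; a rough sketch of the parsing step (key handling here is much simplified compared to the real parser):

```python
import re

def parse_infotext(line: str) -> dict:
    """Parse a generation-parameters line into a key/value dict."""
    return {k.strip(): v.strip() for k, v in re.findall(r"([\w ]+):\s*([^,]+)", line)}

info = "Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 42, Size: 512x512"
params = parse_infotext(info)
print(params["Sampler"], params["Size"])  # → Euler a 512x512
```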
- Reworked **model compile**
- Support custom upscalers in subfolders
- Add additional image info when loading image in process tab
- Better file locking when sharing config and/or models between multiple instances
- Handle custom API endpoints when using auth
- Show logged in user in log when accessing via UI and/or API
- Support `--ckpt none` to skip loading a model
- **XYZ grid**
- Add refiner options to XYZ Grid
- Add option to create only subgrids in XYZ grid, thanks @midcoastal
- Allow custom font, background and text color in settings
- **Fixes**
- Fix `params.txt` saved before actual image
- Fix inpaint
- Fix manual grid image save
- Fix img2img init image save
- Fix upscale in txt2img for batch counts when no hires is used
- More uniform models paths
- Safe scripts callback execution
- Improved extension compatibility
- Improved BF16 support
- Match previews for reference models with downloaded models
## Update for 2023-11-06
Another pretty big release, this time with focus on new models (3 new model types), new backends and optimizations
Plus quite a few fixes
Also, [Wiki](https://github.com/vladmandic/automatic/wiki) has been updated with new content, so check it out!
Some highlights: [OpenVINO](https://github.com/vladmandic/automatic/wiki/OpenVINO), [IntelArc](https://github.com/vladmandic/automatic/wiki/Intel-ARC), [DirectML](https://github.com/vladmandic/automatic/wiki/DirectML), [ONNX/Olive](https://github.com/vladmandic/automatic/wiki/ONNX-Olive)
- **Diffusers**
- since **SD.Next** now supports **12** different model types, we've added a reference model for each type in
*Extra networks -> Reference* for easier select & auto-download
Models can still be downloaded manually, this is just a convenience feature & a showcase for supported models
- new model type: [Segmind SSD-1B](https://huggingface.co/segmind/SSD-1B)
it's a *distilled* model trained at 1024px, this time a 50% smaller and faster version of SD-XL!
(and quality does not suffer, it's just more optimized)
test shows batch-size:4 with 1k images at full quality used less than 6.5GB of VRAM
and for further optimization, you can use built-in **TAESD** decoder,
which results in batch-size:16 with 1k images using 7.9GB of VRAM
select from extra networks -> reference or download using built-in **Huggingface** downloader: `segmind/SSD-1B`
- new model type: [PixArt-α XL 2](https://github.com/PixArt-alpha/PixArt-alpha)
in medium/512px and large/1024px variations
comparable in quality to SD 1.5 and SD-XL, but with better text encoder and highly optimized training pipeline
so finetunes can be done in as little as 10% of the time compared to SD/SD-XL (note that due to the much larger text encoder, it is a large model)
select from extra networks -> reference or download using built-in **Huggingface** downloader: `PixArt-alpha/PixArt-XL-2-1024-MS`
- new model type: [LCM: Latent Consistency Models](https://github.com/openai/consistency_models)
trained at 512px, but with near-instant generation in as little as 3 steps!
combined with OpenVINO, generate on CPU takes less than 5-10 seconds: <https://www.youtube.com/watch?v=b90ESUTLsRo>
and an absolute beast when combined with **HyperTile** and **TAESD** decoder resulting in **28 FPS**
(on RTX4090 for batch 16x16 at 512px)
note: set sampler to **Default** before loading model as LCM comes with its own *LCMScheduler* sampler
select from extra networks -> reference or download using built-in **Huggingface** downloader: `SimianLuo/LCM_Dreamshaper_v7`
- support for **Custom pipelines**, thanks @disty0
download using built-in **Huggingface** downloader
think of them as plugins for diffusers, not unlike original extensions that modify the behavior of the `ldm` backend
list of community pipelines: <https://github.com/huggingface/diffusers/blob/main/examples/community/README.md>
- new custom pipeline: `Disty0/zero123plus-pipeline`, thanks @disty0
generate 4 output images with different camera positions: front, side, top, back!
for more details, see <https://github.com/vladmandic/automatic/discussions/2421>
- new backend: **ONNX/Olive** *(experimental)*, thanks @lshqqytiger
for details, see [WiKi](https://github.com/vladmandic/automatic/wiki/ONNX-Runtime)
- extend support for [Free-U](https://github.com/ChenyangSi/FreeU)
improve generations quality at no cost (other than finding params that work for you)
- **General**
- attempt to auto-fix invalid samples which occur due to math errors in lower precision
example: `RuntimeWarning: invalid value encountered in cast: sample = sample.astype(np.uint8)`
begone **black images** *(note: if it proves to work, this solution will need to be expanded to cover all scenarios)*
- add **Lora OFT** support, thanks @antis0007 and @ai-casanova
- **Upscalers**
- **compile** option, thanks @disty0
- **chaiNNer** add high quality models from [Helaman](https://openmodeldb.info/users/helaman)
- redesigned **Progress bar** with full details on current operation
- new option: *settings -> images -> keep incomplete*
can be used to skip vae decode on aborted/skipped/interrupted image generations
- new option: *settings -> system paths -> models*
can be used to set custom base path for *all* models (previously only as cli option)
- remove external clone of items in `/repositories`
- **Interrogator** module has been removed from `extensions-builtin`
and fully implemented (and improved) natively
- **UI**
- UI tweaks for default themes
- UI switch core font in default theme to **noto-sans**
previously the default font was simply *system-ui*, but it led to too many variations between browsers and platforms
- UI tweaks for mobile devices, thanks @iDeNoh
- updated **Context menu**
right-click on any button in action menu (e.g. generate button)
- **Extra networks**
- sort by name, size, date, etc.
- switch between *gallery* and *list* views
- add tags from user metadata (in addition to tags in model metadata) for **lora**
- added **Reference** models for diffusers backend
- faster enumeration of all networks on server startup
- **Packages**
- updated `diffusers` to 0.22.0, `transformers` to 4.34.1
- update **openvino**, thanks @disty0
- update **directml**, @lshqqytiger
- **Compute**
- **OpenVINO**:
- updated to mainstream `torch` *2.1.0*
- support for **ESRGAN** upscalers
- **Fixes**
- fix **freeu** for backend original and add it to xyz grid
- fix loading diffuser models in huggingface format from non-standard location
- fix default styles looking in wrong location
- fix missing upscaler folder on initial startup
- fix handling of relative path for models
- fix simple live preview device mismatch
- fix batch img2img
- fix diffusers samplers: dpm++ 2m, dpm++ 1s, deis
- fix new style filename template
- fix image name template using model name
- fix image name sequence
- fix model path using relative path
- fix safari/webkit layout, thanks @eadnams22
- fix `torch-rocm` and `tensorflow-rocm` version detection, thanks @xangelix
- fix **chainner** upscalers color clipping
- fix for base+refiner workflow in diffusers mode: number of steps, diffuser pipe mode
- fix for prompt encoder with refiner in diffusers mode
- fix prompts-from-file saving incorrect metadata
- fix add/remove extra networks to prompt
- fix before-hires step
- fix diffusers switch from invalid model
- force second requirements check on startup
- remove **lyco**, multiple_tqdm
- enhance extension compatibility for extensions directly importing codeformers
- enhance extension compatibility for extensions directly accessing processing params
- **css** fixes
- clearly mark external themes in ui
- update `typing-extensions`
## Update for 2023-10-17
This is a major release, with many changes and new functionality...
Changelog is massive, but do read through or you'll be missing out on some very cool new functionality
or even free speedups and quality improvements (regardless of which workflows you're using)!
Note that for this release it's recommended to perform a clean install (e.g. fresh `git clone`)
Upgrades are still possible and supported, but clean install is recommended for best experience
- **UI**
- added **change log** to UI
see *System -> Changelog*
- converted submenus from checkboxes to accordion elements
any ui state including state of open/closed menus can be saved as default!
see *System -> User interface -> Set menu states*
- new built-in theme **invoked**
thanks @BinaryQuantumSoul
- add **compact view** option in settings -> user interface
- small visual indicator bottom right of page showing internal server job state
- **Extra networks**:
- **Details**
- new details interface to view and save data about extra networks
main ui now has a single button on each extra network to trigger details view
- details view includes model/lora metadata parser!
- details view includes civitai model metadata!
- **Metadata**:
- you can scan [civitai](https://civitai.com/)
for missing metadata and previews directly from extra networks
simply click on button in top-right corner of extra networks page
- **Styles**
- save/apply icons moved to extra networks
- can be edited in details view
- support for single or multiple styles per json
- support for embedded previews
- large database of art styles included by default
can be disabled in *settings -> extra networks -> show built-in*
- styles can also be used in a prompt directly: `<style:style_name>`
if the style is an exact match, it will be used
otherwise it will rotate between styles that match the start of the name
that way you can use different styles as wildcards when processing batches
- styles can have **extra** fields, not just prompt and negative prompt
for example: *"Extra: sampler: Euler a, width: 480, height: 640, steps: 30, cfg scale: 10, clip skip: 2"*
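The style-token lookup described above (exact match first, otherwise rotation across prefix matches so a partial name acts as a wildcard across batches) can be sketched like this; names and structure are illustrative, not SD.Next internals:

```python
import itertools

def make_style_resolver(styles):
    """Resolve <style:name> tokens: an exact match wins; otherwise rotate
    through styles whose names start with the given prefix, so a partial
    name acts as a wildcard when processing batches."""
    counters = {}

    def resolve(name):
        if name in styles:
            return styles[name]
        matches = sorted(k for k in styles if k.startswith(name))
        if not matches:
            return None
        i = next(counters.setdefault(name, itertools.count()))
        return styles[matches[i % len(matches)]]

    return resolve

styles = {"noir": "film noir look", "neon-blue": "neon blue tones", "neon-pink": "neon pink tones"}
resolve = make_style_resolver(styles)
print(resolve("noir"))                        # exact match
print(resolve("neon"), "|", resolve("neon"))  # rotates between prefix matches
```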
- **VAE**
- VAEs are now also listed as part of extra networks
- Image preview methods have been redesigned: simple, approximate, taesd, full
please set desired preview method in settings
- both original and diffusers backend now support "full quality" setting
if your desired model or platform does not support FP16 and/or you have low-end hardware and cannot use FP32
you can disable "full quality" in advanced params and it will likely reduce decode errors (infamous black images)
- **LoRA**
- LoRAs are now automatically filtered based on compatibility with currently loaded model
note that if lora type cannot be auto-determined, it will be left in the list
- **Refiner**
- you can load model from extra networks as base model or as refiner
simply select button in top-right of models page
- **General**
- faster search, ability to show/hide/sort networks
- refactored subfolder handling
*note*: this will trigger model hash recalculation on first model use
- **Diffusers**:
- better pipeline **auto-detect** when loading from safetensors
- **SDXL Inpaint**
- although any model can be used for inpainting, there is a case to be made for
dedicated inpainting models as they are tuned to inpaint and not generate
- model can be used as base model for **img2img** or refiner model for **txt2img**
To download go to *Models -> Huggingface*:
- `diffusers/stable-diffusion-xl-1.0-inpainting-0.1` *(6.7GB)*
- **SDXL Instruct-Pix2Pix**
- model can be used as base model for **img2img** or refiner model for **txt2img**
this model is massive and requires a lot of resources!
to download go to *Models -> Huggingface*:
- `diffusers/sdxl-instructpix2pix-768` *(11.9GB)*
- **SD Latent Upscale**
- you can use *SD Latent Upscale* models as **refiner models**
this is a bit experimental, but it works quite well!
to download go to *Models -> Huggingface*:
- `stabilityai/sd-x2-latent-upscaler` *(2.2GB)*
- `stabilityai/stable-diffusion-x4-upscaler` *(1.7GB)*
- better **Prompt attention**
should better handle more complex prompts
for sdxl, choose which part of prompt goes to second text encoder - just add `TE2:` separator in the prompt
for hires and refiner, second pass prompt is used if present, otherwise primary prompt is used
new option in *settings -> diffusers -> sdxl pooled embeds*
thanks @AI-Casanova
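A minimal illustration of the `TE2:` separator (a hypothetical helper; the real routing happens inside the prompt parser): text before the marker goes to the first SDXL text encoder, text after it to the second.

```python
def split_sdxl_prompt(prompt: str):
    """Split an SDXL prompt at the TE2: marker into per-encoder prompts."""
    if "TE2:" in prompt:
        first, second = prompt.split("TE2:", 1)
        return first.strip(), second.strip()
    return prompt, prompt  # no marker: both encoders get the full prompt

print(split_sdxl_prompt("portrait of a knight TE2: cinematic, dramatic rim light"))
```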
- better **Hires** support for SD and SDXL
- better **TI embeddings** support for SD and SDXL
faster loading, wider compatibility and support for embeddings with multiple vectors
information about used embedding is now also added to image metadata
thanks @AI-Casanova
- better **Lora** handling
thanks @AI-Casanova
- better **SDXL preview** quality (approx method)
thanks @BlueAmulet
- new setting: *settings -> diffusers -> force inpaint*
as some models behave better when in *inpaint* mode even for normal *img2img* tasks
- **Upscalers**:
- pretty much a rewrite and tons of new upscalers - built-in list is now at **42**
- fix long outstanding memory leak in legacy code, amazing this went undetected for so long
- more high quality upscalers available by default
**SwinIR** (2), **ESRGAN** (12), **RealESRGAN** (6), **SCUNet** (2)
- if that is not enough, there is new **chaiNNer** integration:
adds 15 more upscalers from different families out-of-the-box:
**HAT** (6), **RealHAT** (2), **DAT** (1), **RRDBNet** (1), **SPSRNet** (1), **SRFormer** (2), **SwiftSR** (2)
and yes, you can download and add your own, just place them in `models/chaiNNer`
- two additional latent upscalers based on SD upscale models when using Diffusers backend
**SD Upscale 2x**, **SD Upscale 4x**
note: Recommended usage for *SD Upscale* is by using second pass instead of upscaler
as it allows for tuning of prompt, seed, sampler settings which are used to guide upscaler
- upscalers are available in **xyz grid**
- simplified *settings->postprocessing->upscalers*
e.g. all upscalers share the same settings for tiling
- allow upscale-only as part of **txt2img** and **img2img** workflows
simply set *denoising strength* to 0 so hires does not get triggered
- unified init/download/execute/progress code
- easier installation
- **Samplers**:
- moved ui options to submenu
- default list for new installs is now all samplers, list can be modified in settings
- simplified samplers configuration in settings
plus added a few new ones like sigma min/max which can highly impact sampler behavior
- note that list of samplers is now *different* since keeping a flat-list of all possible
combinations results in 50+ samplers which is not practical
items such as algorithm (e.g. karras) are actually sampler options, not samplers themselves
- **CivitAI**:
- civitai model download is now multithreaded and resumable
meaning that you can download multiple models in parallel
as well as resume aborted/incomplete downloads
- civitai integration in *models -> civitai* can now find
previews AND metadata for most models (checkpoints, loras, embeddings)
metadata is now parsed and saved in *[model].json*
typical hit rate is >95% for models, loras and embeddings
- description from parsed model metadata is used as model description if there is no manual
description file present in format of *[model].txt*
- to enable search for models, make sure all models have set hash values
*Models -> Validate -> Calculate hashes*
- **LoRA**
- new unified LoRA handler for all LoRA types (lora, lyco, loha, lokr, locon, ia3, etc.)
applies to both original and diffusers backend
thanks @AI-Casanova for diffusers port
- for *backend:original*, separate lyco handler has been removed
- **Compute**
- **CUDA**:
- default updated to `torch` *2.1.0* with cuda *12.1*
- testing moved to `torch` *2.2.0-dev/cu122*
- check out *generate context menu -> show nvml* for live gpu stats (memory, power, temp, clock, etc.)
- **Intel Arc/IPEX**:
- tons of optimizations, built-in binary wheels for Windows
i have to say, intel arc/ipex is getting to be quite a player, especially with openvino
thanks @Disty0 @Nuullll
- **AMD ROCm**:
- updated installer to detect `ROCm` *5.4/5.5/5.6/5.7*
- support for `torch-rocm-5.7`
- **xFormers**:
- default updated to *0.0.23*
- note that latest xformers are still not compatible with cuda 12.1
recommended to use torch 2.1.0 with cuda 11.8
if you attempt to use xformers with cuda 12.1, it will force a full xformers rebuild on install
which can take a very long time and may or may not work
- added cmd param `--use-xformers` to force usage of xformers
- **GC**:
- custom garbage collect threshold to reduce vram memory usage, thanks @Disty0
see *settings -> compute -> gc*
- **Inference**
- new section in **settings**
- [HyperTile](https://github.com/tfernd/HyperTile): new!
available for *diffusers* and *original* backends
massive (up to 2x) speed-up for your generations, for free :)
*note: hypertile is not compatible with any extension that modifies processing parameters such as resolution*
thanks @tfernd
- [Free-U](https://github.com/ChenyangSi/FreeU): new!
available for *diffusers* and *original* backends
improve generations quality at no cost (other than finding params that work for you)
*note: temporarily disabled for diffusers pending release of diffusers==0.22*
thanks @ljleb
- [Token Merging](https://github.com/dbolya/tomesd): not new, but updated
available for *diffusers* and *original* backends
speed-up your generations by merging redundant tokens
speed up will depend on how aggressive you want to be with token merging
- **Batch mode**
new option *settings -> inference -> batch mode*
when using img2img process batch, optionally process multiple images in batch in parallel
thanks @Symbiomatrix
- **NSFW Detection/Censor**
- install extension: [NudeNet](https://github.com/vladmandic/sd-extension-nudenet)
body part detection, image metadata, advanced censoring, etc...
works for *text*, *image* and *process* workflows
more in the extension notes
- **Extensions**
- automatic discovery of new extensions on github
no more waiting for them to appear in index!
- new framework for extension validation
extensions ui now shows actual status of extensions for reviewed extensions
if you want to contribute/flag/update extension status, reach out on github or discord
- better overall compatibility with A1111 extensions (up to a point)
- [MultiDiffusion](https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111)
has been removed from list of built-in extensions
you can still install it manually if desired
- [LyCORIS](https://github.com/KohakuBlueleaf/a1111-sd-webui-lycoris)
has been removed from list of built-in extensions
it is considered obsolete given that all functionality is now built-in
- **General**
- **Startup**
- all main CLI parameters can now be set as environment variables as well
for example `--data-dir <path>` can be specified as `SD_DATADIR=<path>` before starting SD.Next
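Judging from the `--data-dir` → `SD_DATADIR` example, the mapping appears to be `SD_` plus the upper-cased flag name with dashes removed; a sketch of that convention (individual parameters may differ):

```python
import os

def env_name(cli_flag):
    """Map a CLI flag such as '--data-dir' to its environment variable name."""
    return "SD_" + cli_flag.lstrip("-").replace("-", "").upper()

def get_option(cli_value, flag, default):
    """Resolution order: explicit CLI value, then environment variable, then default."""
    return cli_value or os.environ.get(env_name(flag), default)

print(env_name("--data-dir"))  # → SD_DATADIR
os.environ["SD_DATADIR"] = "/data/sdnext"
print(get_option(None, "--data-dir", "~/sdnext"))  # → /data/sdnext
```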
- **XYZ Grid**
- more flexibility to use selection or strings
- **Logging**
- get browser session info in server log
- allow custom log file destination
see `webui --log`
- when running with `--debug` flag, log is force-rotated
so each `sdnext.log.*` represents exactly one server run
- internal server job state tracking
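The force-rotation under `--debug` can be approximated with the standard library's `RotatingFileHandler`; this is a sketch, not the actual SD.Next logging code:

```python
import logging
import logging.handlers
import os
import tempfile

def setup_log(path, debug):
    """Open the log file; with --debug, rotate it first so every
    sdnext.log.* file corresponds to exactly one server run."""
    handler = logging.handlers.RotatingFileHandler(path, backupCount=9)
    if debug and os.path.exists(path) and os.path.getsize(path) > 0:
        handler.doRollover()  # previous run's log becomes <path>.1
    log = logging.getLogger("sdnext-sketch")
    log.setLevel(logging.DEBUG if debug else logging.INFO)
    log.addHandler(handler)
    return log

logfile = os.path.join(tempfile.mkdtemp(), "sdnext.log")
log = setup_log(logfile, debug=True)
log.info("server started")
```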
- **Launcher**
- new `webui.ps1` powershell launcher for windows (old `webui.bat` is still valid)
thanks @em411
- **API**
- add end-to-end example of how to use the API: `cli/simple-txt2img.js`
covers txt2img, upscale, hires, refiner
- **train.py**
- wrapper script around built-in **kohya's lora** training script
see `cli/train.py --help`
new support for sd and sdxl, thanks @evshiron
new support for full offline mode (without sdnext server running)
- **Themes**
- all built-in themes are fully supported:
- *black-teal (default), light-teal, black-orange, invoked, amethyst-nightfall, midnight-barbie*
- if you're using any **gradio default** themes or a **3rd party** theme that is not optimized for SD.Next, you may experience issues
default minimal style has been updated for compatibility, but actual styling is completely outside of SD.Next control
## Update for 2023-09-13
Started as mostly a service release with quite a few fixes, but then...
Major changes how **hires** works as well as support for a very interesting new model [Wuerstchen](https://huggingface.co/blog/wuertschen)
- tons of fixes
- changes to **hires**
- enable non-latent upscale modes (standard upscalers)
- when using latent upscale, hires pass is run automatically
- when using non-latent upscalers, hires pass is skipped by default
enabled using **force hires** option in ui
hires was not designed to work with standard upscalers, but i understand this is a common workflow
- when using refiner, upscale/hires runs before refiner pass
- second pass can now also utilize full/quick vae quality
- note that when combining non-latent upscale, hires and refiner output quality is maximum,
but operations are really resource intensive as it includes: *base->decode->upscale->encode->hires->refine*
- all combinations of: decode full/quick + upscale none/latent/non-latent + hires on/off + refiner on/off
should be supported, but given the number of combinations, issues are possible
- all operations are captured in image metadata
- diffusers:
- allow loading of sd/sdxl models from safetensors without online connectivity
- support for new model: [wuerstchen](https://huggingface.co/warp-ai/wuerstchen)
it's a high-resolution model (1024px+) that's ~40% faster than sd-xl with slightly lower resource requirements
go to *models -> huggingface -> search "warp-ai/wuerstchen" -> download*
it's nearly 12gb in size, so be patient :)
- minor re-layout of the main ui
- updated **ui hints**
- updated **models -> civitai**
- search and download loras
- find previews for already downloaded models or loras
- new option **inference mode**
- default is standard `torch.no_grad`
new option is `torch.inference_mode` which is slightly faster and uses less vram, but only works on some gpus
- new cmdline param `--no-metadata`
skips reading metadata from models that are not already cached
- updated **gradio**
- **styles** support for subfolders
- **css** optimizations
- clean-up **logging**
- capture system info in startup log
- better diagnostic output
- capture extension output
- capture ldm output
- cleaner server restart
- custom exception handling
## Update for 2023-09-06
One week later, another large update!
- system:
- full **python 3.11** support
note that changing python version does require reinstall
and if you're already on python 3.10, there's really no need to upgrade
- themes:
- new default theme: **black-teal**
- new light theme: **light-teal**
- new additional theme: **midnight-barbie**, thanks @nyxia
- extra networks:
- support for **tags**
show tags on hover, search by tag, list tags, add to prompt, etc.
- **styles** are now also listed as part of extra networks
existing `styles.csv` is converted upon startup to individual styles inside `models/style`
this is stage one of new styles functionality
old styles interface is still available, but will be removed in future
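That one-time conversion (one shared `styles.csv` into individual style files) can be sketched roughly as follows; the real converter writes into `models/style` and preserves more fields:

```python
import csv
import io
import json
import os
import tempfile

def convert_styles(csv_text, out_dir):
    """Convert legacy styles.csv rows into one JSON file per style."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = os.path.join(out_dir, f"{row['name']}.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(row, f, indent=2)
        written.append(path)
    return written

legacy = "name,prompt,negative_prompt\nnoir,black and white film noir,color\n"
print(convert_styles(legacy, tempfile.mkdtemp()))
```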
- cache file lists for much faster startup
speedups are 50+% for large number of extra networks
- ui refresh button now refreshes selected page, not all pages
- simplified handling of **descriptions**
now shows on-mouse-over without the need for user interaction
- **metadata** and **info** buttons only show if there is actual content
- diffusers:
- add full support for **textual inversions** (embeddings)
this applies to both sd15 and sdxl
thanks @ai-casanova for porting compel/sdxl code
- mix&match **base** and **refiner** models (*experimental*):
most of those are "because why not" and can result in corrupt images, but some are actually useful
also note that if you're not using an actual refiner model, you need to bump refiner steps
as normal models are not designed to work with low step count
and if you're having issues, try setting prompt parser to "fixed attention" as the majority of problems
are due to token mismatches when using prompt attention
- any sd15 + any sd15
- any sd15 + sdxl-refiner
- any sdxl-base + sdxl-refiner
- any sdxl-base + any sd15
- any sdxl-base + any sdxl-base
- ability to **interrupt** (stop/skip) model generate
- added **aesthetics score** setting (for sdxl)
used to automatically guide the unet towards more pleasing images
highly recommended for simple prompts
- added **force zeros** setting
create zero-tensor for prompt if prompt is empty (positive or negative)
- general:
- `rembg` remove backgrounds support for **is-net** model
- **settings** now show markers for all items set to non-default values
- **metadata** refactored how/what/when metadata is added to images
should result in much cleaner and more complete metadata
- pre-create all system folders on startup
- handle model load errors gracefully
- improved vram reporting in ui
- improved script profiling (when running in debug mode)
## Update for 2023-08-30
Time for quite a large update that has been leaking bit-by-bit over the past week or so...
*Note*: due to large changes, it is recommended to reset (delete) your `ui-config.json`
- diffusers:
- support for **distilled** sd models
just go to models/huggingface and download a model, for example:
`segmind/tiny-sd`, `segmind/small-sd`, `segmind/portrait-finetuned`
those are lower quality, but extremely small and fast
up to 50% faster than sd 1.5 and execute in as little as 2.1gb of vram
- general:
- redesigned **settings**
- new layout with separated sections:
*settings, ui config, licenses, system info, benchmark, models*
- **system info** tab is now part of settings
when running outside of sdnext, system info is shown in main ui
- all system and image paths are now relative by default
- add settings validation when performing load/save
- settings tab in ui now shows settings that are changed from default values
- settings tab switch to compact view
- update **gradio** major version
this may result in some smaller layout changes since it's a major version change
however, browser page load is now much faster
- optimizations:
- optimize model hashing
- add cli param `--skip-all` that skips all installer checks
use at personal discretion, but it can be useful for bulk deployments
- add model **precompile** option (when model compile is enabled)
- **extra network** folder info caching
results in much faster startup when you have large number of extra networks
- faster **xyz grid** switching
especially when using different checkpoints
- update **second pass** options for clarity
- models:
- civitai download missing model previews
- add **openvino** (experimental) cpu optimized model compile and inference
enable with `--use-openvino`
thanks @disty0
- enable batch **img2img** scale-by workflows
now you can batch process with rescaling based on each individual original image size
- fixes:
- fix extra networks previews
- css fixes
- improved extensions compatibility (e.g. *sd-cn-animation*)
- allow changing **vae** on-the-fly for both original and diffusers backend
## Update for 2023-08-20
Another release that's been baking in the dev branch for a while...
- general:
- caching of extra network information to enable much faster create/refresh operations
thanks @midcoastal
- diffusers:
- add **hires** support (*experimental*)
applies to all model types that support img2img, including **sd** and **sd-xl**
also supports all hires upscaler types as well as standard params like steps and denoising strength
when used with **sd-xl**, it can be used with or without refiner loaded
how to enable - there are no explicit checkboxes other than second pass itself:
- hires: upscaler is set and target resolution is not at default
- refiner: if refiner model is loaded
- images save options: *before hires*, *before refiner*
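The implicit enable rules above amount to roughly this decision logic (a sketch with illustrative argument names):

```python
def second_pass_ops(upscaler, size, target_size, refiner_loaded):
    """Decide which second-pass operations run: hires when an upscaler is
    set and the target resolution differs from the default, refiner when
    a refiner model is loaded."""
    ops = []
    if upscaler and target_size != size:
        ops.append("hires")
    if refiner_loaded:
        ops.append("refiner")
    return ops

print(second_pass_ops("Latent", (512, 512), (1024, 1024), refiner_loaded=True))
# → ['hires', 'refiner']
```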
- redo `move model to cpu` logic in settings -> diffusers to be more reliable
note that system defaults have also changed, so you may need to tweak to your liking
- update dependencies
## Update for 2023-08-17
Smaller update, but with some breaking changes (to prepare for future larger functionality)...
- general:
- update all metadata saved with images
see <https://github.com/vladmandic/automatic/wiki/Metadata> for details
- improved **amd** installer with support for **navi 2x & 3x** and **rocm 5.4/5.5/5.6**
thanks @evshiron
- fix **img2img** resizing (applies to *original, diffusers, hires*)
- config change: main `config.json` no longer contains entire configuration
but only differences from defaults (similar to recent change performed to `ui-config.json`)
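The diff-on-save / merge-on-load approach can be sketched in a few lines (helper names are illustrative):

```python
def diff_config(config, defaults):
    """Keep only entries that differ from the shipped defaults."""
    return {k: v for k, v in config.items() if defaults.get(k) != v}

def load_config(saved, defaults):
    """Rebuild the effective configuration: defaults overlaid with the diff."""
    return {**defaults, **saved}

defaults = {"theme": "black-teal", "cfg_scale": 6.0, "samples_dir": "outputs"}
user = {"theme": "light-teal", "cfg_scale": 6.0, "samples_dir": "outputs"}
saved = diff_config(user, defaults)
print(saved)                         # → {'theme': 'light-teal'}
print(load_config(saved, defaults))  # full config restored
```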
- diffusers:
- enable **batch img2img** workflows
- original:
- new samplers: **dpm++ 3M sde** (standard and karras variations)
enable in *settings -> samplers -> show samplers*
- expose always/never discard penultimate sigma
enable in *settings -> samplers*
## Update for 2023-08-11
This is a big one that's been cooking in `dev` for a while now, but it's finally ready for release...
- diffusers:
- **pipeline autodetect**
if pipeline is set to autodetect (default for new installs), app will try to autodetect pipeline based on selected model
this should reduce user errors such as loading **sd-xl** model when **sd** pipeline is selected
- **quick vae decode** as alternative to full vae decode which is very resource intensive
quick decode is based on `taesd` and produces lower quality, but it's great for tests or grids as it runs much faster and uses far less vram
disabled by default, selectable in *txt2img/img2img -> advanced -> full quality*
- **prompt attention** for sd and sd-xl
supports both `full parser` and native `compel`
thanks @ai-casanova
- advanced **lora load/apply** methods
in addition to standard lora loading that was recently added to sd-xl using diffusers, now we have
- **sequential apply** (load & apply multiple loras in sequential manner) and
- **merge and apply** (load multiple loras and merge before applying to model)
see *settings -> diffusers -> lora methods*
thanks @hameerabbasi and @ai-casanova
- **sd-xl vae** from safetensors now applies correct config
result is that 3rd party vaes can be used without washed out colors
- options for optimized memory handling for lower memory usage
see *settings -> diffusers*
- general:
- new **civitai model search and download**
native support for civitai, integrated into ui as *models -> civitai*
- updated requirements
this time it's a bigger change, so the upgrade may take longer to install new requirements
- improved **extra networks** performance with large number of networks
## Update for 2023-08-05
Another minor update, but it unlocks some cool new items...
- diffusers:
- vaesd live preview (sd and sd-xl)
- fix inpainting (sd and sd-xl)
- general:
- new torch 2.0 with ipex (intel arc)
- additional callbacks for extensions
enables latest comfyui extension
## Update for 2023-07-30
Smaller release, but IMO worth a post...
- diffusers:
- sd-xl loras are now supported!
- memory optimizations: Enhanced sequential CPU offloading, model CPU offload, FP16 VAE
- significant impact if running SD-XL (for example, but applies to any model) with only 8GB VRAM
- update packages
- minor bugfixes
## Update for 2023-07-26
This is a big one, new models, new diffusers, new features and updated UI...
First, **SD-XL 1.0** is released and yes, SD.Next supports it out of the box!
- [SD-XL Base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0.safetensors)
- [SD-XL Refiner](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/blob/main/sd_xl_refiner_1.0.safetensors)
Also fresh is new **Kandinsky 2.2** model that does look quite nice:
- [Kandinsky Decoder](https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder)
- [Kandinsky Prior](https://huggingface.co/kandinsky-community/kandinsky-2-2-prior)
Actual changelog is:
- general:
- new loading screens and artwork
- major ui simplification for both txt2img and img2img
nothing is removed, but you can show/hide individual sections
default is very simple interface, but you can enable any sections and save it as default in settings
- themes: add additional built-in theme, `amethyst-nightfall`
- extra networks: add add/remove tags to prompt (e.g. lora activation keywords)
- extensions: fix couple of compatibility items
- firefox compatibility improvements
- minor image viewer improvements
- add backend and operation info to metadata
- diffusers:
- we're out of the experimental phase and the diffusers backend is considered stable
- sd-xl: support for **sd-xl 1.0** official model
- sd-xl: loading vae now applies to both base and refiner and saves a bit of vram
- sd-xl: denoising_start/denoising_end
- sd-xl: enable dual prompts
dual prompt is used if set, regardless of whether refiner is enabled/loaded
if refiner is loaded & enabled, refiner prompt will also be used for refiner pass
- primary prompt goes to [OpenAI CLIP-ViT/L-14](https://huggingface.co/openai/clip-vit-large-patch14)
- refiner prompt goes to [OpenCLIP-ViT/bigG-14](https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k)
- **kandinsky 2.2** support
note: kandinsky model must be downloaded using model downloader, not as safetensors due to specific model format
- refiner: fix batch processing
- vae: enable loading of pure-safetensors vae files without config
also enable *automatic* selection to work with diffusers
- sd-xl: initial lora support
right now this applies to official lora released by **stability-ai**, support for **kohya's** lora is expected soon
- implement img2img and inpainting (experimental)
actual support and quality depends on model
it works as expected for sd 1.5, but not so much for sd-xl for now
- implement limited stop/interrupt for diffusers
works between stages, not within steps
- add option to save image before refiner pass
- option to set vae upcast in settings
- enable fp16 vae decode when using optimized vae
this pretty much doubles performance of decode step (delay after generate is done)
- original:
- fix hires secondary sampler
this now fully obsoletes `fallback_sampler` and `force_hr_sampler_name`
## Update for 2023-07-18
While we're waiting for the official SD-XL release, here's another update with some fixes and enhancements...
- **global**
- image save: option to add invisible image watermark to all your generated images
disabled by default, can be enabled in settings -> image options
watermark information will be shown when loading image such as in process image tab
also additional cli utility `/cli/image-watermark.py` to read/write/strip watermarks from images
- batch processing: fix metadata saving, also allow to drag&drop images for batch processing
- ui configuration: you can modify all ui default values from settings as usual,
but only values that are non-default will be written to `ui-config.json`
- startup: add cmd flag to skip all `torch` checks
- startup: force requirements check on each server start
there are too many misbehaving extensions that change system requirements
- internal: safe handling of all config file read/write operations
this allows sdnext to run in fully shared environments and prevents any possible configuration corruptions
- **diffusers**:
- sd-xl: remove image watermarks autocreated by 0.9 model
- vae: enable loading of external vae, documented in diffusers wiki
and mix&match continues, you can even use sd-xl vae with sd 1.5 models!
- samplers: add concept of *default* sampler to avoid needing to tweak settings for primary or second pass
note that sampler details will be printed in log when running in debug level
- samplers: allow overriding of sampler beta values in settings
- refiner: fix refiner applying only to first image in batch
- refiner: allow using direct latents or processed output in refiner
- model: basic support for one more model: [UniDiffuser](https://github.com/thu-ml/unidiffuser)
download using model downloader: `thu-ml/unidiffuser-v1`
and set resolution to 512x512
## Update for 2023-07-14
Trying to unify settings for both original and diffusers backend without introducing duplicates...
- renamed **hires fix** to **second pass**
as that is what it actually is; the name hires fix was misleading to start with
- actual **hires fix** and **refiner** are now options inside **second pass** section
- obsoleted settings -> sampler -> **force_hr_sampler_name**
it is now part of **second pass** options and it works the same for both original and diffusers backend
which means you can use different scheduler settings for txt2img and hires if you want
- sd-xl refiner will run if it's loaded and if second pass is enabled
so you can quickly enable/disable refiner by simply enabling/disabling second pass
- you can mix&match **model** and **refiner**
for example, you can generate image using sd 1.5 and still use sd-xl refiner as second pass
- reorganized settings -> samplers to show which section refers to which backend
- added diffusers **lmsd** sampler
## Update for 2023-07-13
Another big one, but now improvements to both **diffusers** and **original** backends as well plus ability to dynamically switch between them!
- switch backend between diffusers and original on-the-fly
- you can still use `--backend <backend>` and now that only means in which mode app will start,
but you can change it anytime in ui settings
- for example, you can even do things like generate image using sd-xl,
then switch to original backend and perform inpaint using a different model
- diffusers backend:
- separate ui settings for refiner pass with sd-xl
you can specify: prompt, negative prompt, steps, denoise start
- fix loading from pure safetensors files
now you can load sd-xl from safetensors file or from huggingface folder format
- fix kandinsky model (2.1 working, 2.2 was just released and will be supported soon)
- original backend:
- improvements to vae/unet handling as well as cross-optimization heads
in non-technical terms, this means lower memory usage and higher performance
and you should be able to generate higher resolution images without any other changes
- other:
- major refactoring of the javascript code
includes fixes for text selections and navigation
- system info tab now reports on nvidia driver version as well
- minor fixes in extra-networks
- installer handles origin changes for submodules
big thanks to @huggingface team for great communication, support and fixing all the reported issues asap!
## Update for 2023-07-10
Service release with some fixes and enhancements:
- diffusers:
- option to move base and/or refiner model to cpu to free up vram
- model downloader options to specify model variant / revision / mirror
- now you can download `fp16` variant directly for reduced memory footprint
- basic **img2img** workflow (*sketch* and *inpaint* are not supported yet)
note that **sd-xl** img2img workflows are architecturally different so it will take longer to implement
- updated hints for settings
- extra networks:
- fix corrupt display on refresh when new extra network type found
- additional ui tweaks
- generate thumbnails from previews only if preview resolution is above 1k
- image viewer:
- fixes for non-chromium browsers and mobile users
- option to download image directly from image viewer
- general
- fix startup issue with incorrect config
- installer should always check requirements on upgrades
## Update for 2023-07-08
This is a massive update which has been baking in a `dev` branch for a while now
- merge experimental diffusers support
*TL;DR*: Yes, you can run **SD-XL** model in **SD.Next** now
For details, see Wiki page: [Diffusers](https://github.com/vladmandic/automatic/wiki/Diffusers)
Note this is still experimental, so please follow Wiki
Additional enhancements and fixes will be provided over the next few days
*Thanks to @huggingface team for making this possible and our internal @team for all the early testing*
Release also contains number of smaller updates:
- add pan & zoom controls (touch and mouse) to image viewer (lightbox)
- cache extra networks between tabs
this should result in neat 2x speedup on building extra networks
- add settings -> extra networks -> do not automatically build extra network pages
speeds up app start if you have a lot of extra networks and you want to build them manually when needed
- extra network ui tweaks
## Update for 2023-07-01
Small quality-of-life updates and bugfixes:
- add option to disallow usage of ckpt checkpoints
- change lora and lyco dir without server restart
- additional filename template fields: `uuid`, `seq`, `image_hash`
- image toolbar is now shown only when image is present
- image `Zip` button is gone; it's now an optional setting that applies to the standard `Save` button
- folder `Show` button is present only when working on localhost,
otherwise it's replaced with `Copy` that places image URLs on clipboard so they can be used in other apps
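The new filename template fields can be illustrated with a small sketch. Bracket-style placeholders and the exact field computation here are assumptions for illustration only; the app's real template syntax may differ:

```python
import hashlib
import uuid

def apply_filename_template(template: str, image_bytes: bytes, seq: int) -> str:
    # hypothetical expansion of the new template fields:
    # [uuid] -> random uuid4, [seq] -> zero-padded counter,
    # [image_hash] -> short sha256 digest of the image bytes
    fields = {
        "uuid": str(uuid.uuid4()),
        "seq": f"{seq:05d}",
        "image_hash": hashlib.sha256(image_bytes).hexdigest()[:8],
    }
    for name, value in fields.items():
        template = template.replace(f"[{name}]", value)
    return template
```

For example, `apply_filename_template("[seq]-[image_hash]", image_bytes, 3)` would yield a name like `00003-a1b2c3d4`.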
## Update for 2023-06-30
A bit bigger update this time, but contained to specific areas...
- change in behavior
extensions no longer auto-update on startup
using `--upgrade` flag upgrades core app as well as all submodules and extensions
- **live server log monitoring** in ui
configurable via settings -> live preview
- new **extra networks interface**
*note: if you're using a 3rd party ui extension for extra networks, it will likely need to be updated to work with new interface*
- display in front of main ui, inline with main ui or as a sidebar
- lazy load thumbnails
drastically reduces load times for large number of extra networks
- auto-create thumbnails from preview images in extra networks in a background thread
significant load time saving on subsequent restarts
- support for info files in addition to description files
- support for variable aspect-ratio thumbnails
- new folder view
- **extensions sort** by trending
- add requirements check for training
## Update for 2023-06-26
- new training tab interface
- redesigned preprocess, train embedding, train hypernetwork
- new models tab interface
- new model convert functionality, thanks @akegarasu
- new model verify functionality
- lot of ipex specific fixes/optimizations, thanks @disty0
## Update for 2023-06-20
This one is less relevant for standard users, but pretty major if you're running an actual server
But even if not, it still includes a bunch of cumulative fixes since the last release - and going by the number of new issues, this is probably the most stable release so far...
(next one is not going to be as stable, but it will be fun :) )
- minor improvements to extra networks ui
- more hints/tooltips integrated into ui
- new dedicated api server
- highly promising for high-throughput server scenarios
- improve server logging and monitoring with
- server log file rotation
- ring buffer with api endpoint `/sdapi/v1/log`
- real-time status and load endpoint `/sdapi/v1/system-info/status`
## Update for 2023-06-14
Second stage of a jumbo merge from upstream plus a few minor changes...
- simplify token merging
- reorganize some settings
- all updates from upstream: **A1111** v1.3.2 [df004be] *(latest release)*
pretty much nothing major that i haven't released in previous versions, but it's still a long list of tiny changes
- skipped/did-not-port:
add separate hires prompt: unnecessarily complicated and spread over large number of commits due to many regressions
allow external scripts to add cross-optimization methods: dangerous and i don't see a use case for it so far
load extension info in threads: unnecessary as other optimizations i've already put in place perform equally well
- broken/reverted:
sub-quadratic optimization changes
## Update for 2023-06-13
Just a day later and one *bigger update*...
Both some **new functionality** as well as **massive merges** from upstream
- new cache for models/lora/lyco metadata: `metadata.json`
drastically reduces disk access on app startup
- allow saving/resetting of **ui default values**
settings -> ui defaults
- ability to run server without loaded model
default is to auto-load model on startup, can be changed in settings -> stable diffusion
if disabled, model will be loaded on first request, e.g. when you click generate
useful when you want to start server to perform other tasks like upscaling which do not rely on model
- updated `accelerate` and `xformers`
- huge number of changes ported from **A1111** upstream
this was a massive merge, hopefully this does not cause any regressions
and still a bit more pending...
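The idea behind the `metadata.json` cache above can be sketched as follows; this is an assumed scheme keyed by file path and modification time, not the actual implementation:

```python
import json
import os

def get_metadata(path: str, cache_file: str, reader) -> dict:
    """Return metadata for a model file, re-parsing only if the file changed on disk."""
    try:
        with open(cache_file) as f:
            cache = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        cache = {}
    mtime = os.path.getmtime(path)
    entry = cache.get(path)
    if entry is not None and entry["mtime"] == mtime:
        return entry["meta"]  # cache hit: no disk parse needed
    meta = reader(path)  # expensive step, e.g. parsing a safetensors header
    cache[path] = {"mtime": mtime, "meta": meta}
    with open(cache_file, "w") as f:
        json.dump(cache, f)
    return meta
```

On startup, repeated calls hit the cache and skip the expensive `reader` entirely, which is where the reduction in disk access comes from.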
## Update for 2023-06-12
- updated ui labels and hints to improve clarity and provide some extra info
this is 1st stage of the process, more to come...
if you want to join the effort, see <https://github.com/vladmandic/automatic/discussions/1246>
- new localization and hints engine
how hints are displayed can be selected in settings -> ui
- reworked **installer** sequence
as some extensions are loading packages directly from their preload sequence
which was preventing some optimizations to take effect
- updated **settings** tab functionality, thanks @gegell
with real-time monitor for all new and/or updated settings
- **launcher** will now warn if application owned files are modified
you are free to add any user files, but do not modify app files unless you're sure of what you're doing
- add more profiling for scripts/extensions so you can see what takes time
this applies both to initial load as well as execution
- experimental `sd_model_dict` setting which allows you to load model dictionary
from one model and apply weights from another model specified in `sd_model_checkpoint`
results? who am i to judge :)
## Update for 2023-06-05
Few new features and extra handling for broken extensions
that caused my phone to go crazy with notifications over the weekend...
- added extra networks to **xyz grid** options
now you can have more fun with all your embeddings and loras :)
- new **vae decode** method to help with larger batch sizes, thanks @bigdog
- new setting -> lora -> **use lycoris to handle all lora types**
this is still experimental, but the goal is to obsolete old built-in lora module
as it doesn't understand many new loras and built-in lyco module can handle it all
- somewhat optimize browser page loading
still slower than i'd want, but gradio is pretty bad at this
- profiling of scripts/extensions callbacks
you can now see how much of pre/post processing is done, not just how long generate takes
- additional exception handling so bad exception does not crash main app
- additional background removal models
- some work on bfloat16 which nobody really should be using, but why not :)
## Update for 2023-06-02
Some quality-of-life improvements while working on larger stuff in the background...
- redesign action box to be uniform across all themes
- add **pause** option next to stop/skip
- redesigned progress bar
- add new built-in extension: **agent-scheduler**
very elegant way to getting full queueing capabilities, thank @artventurdev
- enable more image formats
note: not all are understood by browser so previews and images may appear as blank
unless you have some browser extensions that can handle them
but they are saved correctly. and can't beat raw quality of 32-bit `tiff` or `psd` :)
- change in behavior: `xformers` will be uninstalled on startup if they are not active
if you do have `xformers` selected as your desired cross-optimization method, then they will be used
reason is that a lot of libraries try to blindly import xformers even if they are not selected or not functional
## Update for 2023-05-30
Another bigger one...And more to come in the next few days...
- new live preview mode: taesd
i really like this one, so it's enabled by default for new installs
- settings search feature
- new sampler: dpm++ 2m sde
- fully common save/zip/delete (new) options in all tabs
which (again) meant rework of process image tab
- system info tab: live gpu utilization/memory graphs for nvidia gpus
- updated controlnet interface
- minor style changes
- updated lora, swinir, scunet and ldsr code from upstream
- start of merge from a1111 v1.3
## Update for 2023-05-26
Some quality-of-life improvements...
- updated [README](https://github.com/vladmandic/automatic/blob/master/README.md)
- created [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)
this will be the source for all info about new things moving forward
and cross-posted to [Discussions#99](https://github.com/vladmandic/automatic/discussions/99) as well as discord [announcements](https://discord.com/channels/1101998836328697867/1109953953396957286)
- optimize model loading on startup
this should reduce startup time significantly
- set default cross-optimization method for each platform backend
applicable for new installs only
- `cuda` => Scaled-Dot-Product
- `rocm` => Sub-quadratic
- `directml` => Sub-quadratic
- `ipex` => InvokeAI's
- `mps` => Doggettx's
- `cpu` => Doggettx's
- optimize logging
- optimize profiling
now includes startup profiling as well as `cuda` profiling during generate
- minor lightbox improvements
- bugfixes... i don't recall a release without at least several of those
other than that - first stage of [Diffusers](https://github.com/huggingface/diffusers) integration is now in master branch
i don't recommend anyone try it (and don't even think of reporting issues for it)
but if anyone wants to contribute, take a look at [project page](https://github.com/users/vladmandic/projects/1/views/1)
## Update for 2023-05-23
Major internal work with perhaps not that much user-facing to show for it ;)
- update core repos: **stability-ai**, **taming-transformers**, **k-diffusion, blip**, **codeformer**
note: to avoid disruptions, this is applicable for new installs only
- tested with **torch 2.1**, **cuda 12.1**, **cudnn 8.9**
(production remains on torch2.0.1+cuda11.8+cudnn8.8)
- fully extend support of `--data-dir`
allows multiple installations to share pretty much everything, not just models
especially useful if you want to run in a stateless container or cloud instance
- redo api authentication
now api authentication will use the same user/pwd (if specified) for ui and strictly enforce it using httpbasicauth
new authentication is also fully supported in combination with ssl for both sync and async calls
if you want to use the api programmatically, see examples in `cli/sdapi.py`
- add dark/light theme mode toggle
- redo some `clip-skip` functionality
- better matching for vae vs model
- update to `xyz grid` to allow creation of large number of images without creating grid itself
- update `gradio` (again)
- more prompt parser optimizations
- better error handling when importing image settings which are not compatible with current install
for example, when upscaler or sampler originally used is not available
- fixes... amazing how many issues were introduced by porting a1111 v1.20 code while adding almost no new functionality
next one is v1.30 (still in dev) which does bring a lot of new features
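Since authentication is now plain HTTP basic auth, calling the API programmatically only needs an `Authorization` header. A minimal stdlib sketch; the host, port, credentials, and the use of the `/sdapi/v1/log` endpoint are illustrative placeholders:

```python
import base64
import urllib.request

def authed_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build a request carrying HTTP basic-auth credentials."""
    req = urllib.request.Request(url)
    # basic auth is just 'user:password' base64-encoded in a header
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

# example: prepare a call to the server log endpoint
req = authed_request("http://127.0.0.1:7860/sdapi/v1/log", "admin", "secret")
# urllib.request.urlopen(req) would perform the actual call against a running server
```

The same header works for both sync and async clients, which is why it also composes cleanly with ssl as noted above.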
## Update for 2023-05-17
This is a massive one due to huge number of changes,
but hopefully it will go ok...
- new **prompt parsers**
select in UI -> Settings -> Stable Diffusion
- **Full**: my new implementation
- **A1111**: for backward compatibility
- **Compel**: as used in ComfyUI and InvokeAI (a.k.a *Temporal Weighting*)
- **Fixed**: for really old backward compatibility
- monitor **extensions** install/startup and
log if they modify any packages/requirements
this is a *deep-experimental* python hack, but i think it's worth it as extensions modifying requirements
is one of the most common causes of issues
- added `--safe` command line flag mode which skips loading user extensions
please try to use it before opening new issue
- reintroduce `--api-only` mode to start server without ui
- port *all* upstream changes from [A1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui)
up to today - commit hash `89f9faa`
## Update for 2023-05-15
- major work on **prompt parsing**
this can cause some differences in results compared to what you're used to, but it's all about fixes & improvements
- prompt parser was adding commas and spaces as separate words and tokens and/or prefixes
- negative prompt weight using `[word:weight]` was ignored, it was always `0.909`
- bracket matching was anything but correct. complex nested attention brackets are now working.
- btw, if you run with `--debug` flag, you'll now actually see parsed prompt & schedule
- updated all scripts in `/cli`
- add option in settings to force different **latent sampler** instead of using primary only
- add **interrupt/skip** capabilities to process images
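The bracket syntax the parser fixes above follows the common a1111 convention: `(word)` boosts attention by 1.1, `[word]` reduces it by the ~0.909 factor mentioned earlier (1/1.1), and `(word:1.3)` sets an explicit weight. A deliberately minimal, non-nested sketch of that parsing, not the actual parser (which also handles nesting and schedules):

```python
import re

# alternatives, tried in order: (text:weight), (text), [text], plain words
ATTENTION = re.compile(
    r"\((?P<text>[^:()]+):(?P<weight>[\d.]+)\)"
    r"|\((?P<plain>[^()]+)\)"
    r"|\[(?P<down>[^\[\]]+)\]"
    r"|(?P<word>[^()\[\]]+)"
)

def parse_attention(prompt: str) -> list:
    """Split a prompt into (text, weight) pairs using a1111-style brackets."""
    tokens = []
    for m in ATTENTION.finditer(prompt):
        if m.group("text"):
            tokens.append((m.group("text"), float(m.group("weight"))))
        elif m.group("plain"):
            tokens.append((m.group("plain"), 1.1))          # (word) -> up-weight
        elif m.group("down"):
            tokens.append((m.group("down"), round(1 / 1.1, 3)))  # [word] -> ~0.909
        elif m.group("word").strip():
            tokens.append((m.group("word").strip(), 1.0))
    return tokens
```

For example, `parse_attention("a (cat:1.3) [dog]")` yields `[("a", 1.0), ("cat", 1.3), ("dog", 0.909)]`, which is exactly the kind of schedule the `--debug` flag now prints.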
## Update for 2023-05-13
This is mostly about optimizations...
- improved `torch-directml` support
especially interesting for **amd** users on **windows** where **torch+rocm** is not yet available
don't forget to run using `--use-directml` or the default is **cpu**
- improved compatibility with **nvidia** rtx 1xxx/2xxx series gpus
- fully working `torch.compile` with **torch 2.0.1**
using `inductor` compile takes a while on first run, but does result in 5-10% performance increase
- improved memory handling
for highest performance, you can also disable aggressive **gc** in settings
- improved performance
especially *after* generate as image handling has been moved to separate thread
- allow per-extension updates in extension manager
- option to reset configuration in settings
## Update for 2023-05-11
- brand new **extension manager**
this is pretty much a complete rewrite, so new issues are possible
- support for `torch` 2.0.1
note that if you are experiencing frequent hangs, this may be worth a try
- updated `gradio` to 3.29.0
- added `--reinstall` flag to force reinstall of all packages
- auto-recover & re-attempt when `--upgrade` is requested but fails
- check for duplicate extensions
## Update for 2023-05-08
Back online with few updates:
- bugfixes. yup, quite a lot of those
- auto-detect some cpu/gpu capabilities on startup
this should reduce the need to tweak and tune settings like no-half, no-half-vae, fp16 vs fp32, etc
- configurable order of top level tabs
- configurable order of scripts in txt2img and img2img
for both, see sections in ui-> settings -> user interface
## Update for 2023-05-04
Again, few days later...
- reviewed/ported **all** commits from **A1111** upstream
a few are not applicable as i already have alternative implementations
and a very few i chose not to implement (save/restore last-known-good-config is a bad hack)
otherwise, we're fully up to date (it doesn't show on fork status as code merges were mostly manual due to conflicts)
but... due to the sheer size of the updates, this may introduce some temporary issues
- redesigned server restart function
now available and working in ui
actually, since server restart is now a true restart and not ui restart, it can be used much more flexibly
- faster model load
plus support for slower devices via stream-load function (in ui settings)
- better logging
this includes new `--debug` flag for more verbose logging when troubleshooting
## Update for 2023-05-01
Been a bit quieter for last few days as changes were quite significant, but finally here we are...
- Updated core libraries: Gradio, Diffusers, Transformers
- Added support for **Intel ARC** GPUs via Intel OneAPI IPEX (auto-detected)
- Added support for **TorchML** (set by default when running on non-compatible GPU or on CPU)
- Enhanced support for AMD GPUs with **ROCm**
- Enhanced support for Apple **M1/M2**
- Redesigned command params: run `webui --help` for details
- Redesigned API and script processing
- Experimental support for multiple **Torch compile** options
- Improved sampler support
- Google Colab: <https://colab.research.google.com/drive/126cDNwHfifCyUpCCQF9IHpEdiXRfHrLN>
Maintained by <https://github.com/Linaqruf/sd-notebook-collection>
- Fixes, fixes, fixes...
To take advantage of new out-of-the-box tunings, it's recommended to delete your `config.json` so new defaults are applied. it's not necessary, but otherwise you may need to play with UI Settings to get the best of Intel ARC, TorchML, ROCm or Apple M1/M2.
## Update for 2023-04-27
a bit shorter list as:
- i've been busy with bugfixing
there are a lot of them, not going to list each here.
but seems like critical issues backlog is quieting down and soon i can focus on new features development.
- i've started collaborating with a couple of major projects,
hopefully this will accelerate future development.
what's new:
- ability to view/add/edit model description shown in extra networks cards
- add option to specify fallback sampler if primary sampler is not compatible with desired operation
- make clip skip a local parameter
- remove obsolete items from UI settings
- set defaults for AMD ROCm
if you have issues, you may want to start with a fresh install so configuration can be created from scratch
- set defaults for Apple M1/M2
if you have issues, you may want to start with a fresh install so configuration can be created from scratch
## Update for 2023-04-25
- update process image -> info
- add VAE info to metadata
- update GPU utility search paths for better GPU type detection
- update git flags for wider compatibility
- update environment tuning
- update ti training defaults
- update VAE search paths
- add compatibility opts for some old extensions
- validate script args for always-on scripts
fixes: deforum with controlnet
## Update for 2023-04-24
- identify race condition where generate locks up while fetching preview
- add pulldowns to x/y/z script
- add VAE rollback feature in case of NaNs
- use samples format for live preview
- add token merging
- use **Approx NN** for live preview
- create default `styles.csv`
- fix setup not installing `tensorflow` dependencies
- update default git flags to reduce number of warnings
## Update for 2023-04-23
- fix VAE dtype
should fix most issues with NaN or black images
- add built-in Gradio themes
- reduce requirements
- more AMD specific work
- initial work on Apple platform support
- additional PR merges
- handle torch cuda crashing in setup
- fix setup race conditions
- fix ui lightbox
- mark tensorflow as optional
- add additional image name templates
## Update for 2023-04-22
- autodetect which system libs should be installed
this is a first pass of autoconfig for **nVidia** vs **AMD** environments
- fix parse cmd line args from extensions
- only install `xformers` if actually selected as desired cross-attention method
- do not attempt to use `xformers` or `sdp` if running on cpu
- merge tomesd token merging
- merge 23 PRs pending from a1111 backlog (!!)
*expect shorter updates for the next few days as ill be partially ooo*
## Update for 2023-04-20
- full CUDA tuning section in UI Settings
- improve exif/pnginfo metadata parsing
it can now handle 3rd party images or images edited in external software
- optimized setup performance and logging
- improve compatibility with some 3rd party extensions
for example handle extensions that install packages directly from github urls
- fix initial model download if no models found
- fix vae not found issues
- fix multiple git issues
note: if you previously had command line optimizations such as --no-half, those are now ignored and moved to ui settings
## Update for 2023-04-19
- fix live preview
- fix model merge
- fix handling of user-defined temp folders
- fix submit benchmark
- option to override `torch` and `xformers` installer
- separate benchmark data for system-info extension
- minor css fixes
- created initial merge backlog from pending prs on a1111 repo
see #258 for details
## Update for 2023-04-18
- reconnect ui to active session on browser restart
this is one of the most frequently asked-for items, finally figured it out
works for text and image generation, but not for process as there is no progress bar reported there to start with
- force unload `xformers` when not used
improves compatibility with AMD/M1 platforms
- add `styles.csv` to UI settings to allow customizing path
- add `--skip-git` to cmd flags for power users that want
to skip all git checks and operations and perform manual updates
- add `--disable-queue` to cmd flags that disables Gradio queues (experimental)
this forces it to use HTTP instead of WebSockets and can help on unreliable network connections
- set scripts & extensions loading priority and allow custom priorities
fixes random extension issues:
`ScuNet` upscaler disappearing, `Additional Networks` not showing up on XYZ axis, etc.
- improve html loading order
- remove some `asserts` causing runtime errors and replace with user-friendly messages
- update README.md
## Update for 2023-04-17
- **themes** are now dynamic and discovered from list of available gradio themes on huggingface
it's quite a list of 30+ supported themes so far
- added option to see **theme preview** without the need to apply it or restart server
- integrated **image info** functionality into **process image** tab and removed separate **image info** tab
- more installer improvements
- fix urls
- updated github integration
- make model download as optional if no models found
## Update for 2023-04-16
- support for ui themes! go to *settings* -> *user interface* -> *ui theme*
includes 12 predefined themes
- ability to restart server from ui
- updated requirements
- removed `styles.csv` from repo, it's now fully under user control
- removed model-keyword extension as overly aggressive
- rewrite of the fastapi middleware handlers
- install bugfixes, hopefully new installer is now ok
i really want to focus on features and not troubleshooting the installer
## Update for 2023-04-15
- update default values
- remove `ui-config.json` from repo, it's now fully under user control
- updated extensions manager
- updated locon/lycoris plugin
- enable quick launch by default
- add multidiffusion upscaler extensions
- add model keyword extension
- enable strong linting
- fix circular imports
- fix extensions updated
- fix git update issues
- update github templates
## Update for 2023-04-14
- handle duplicate extensions
- redo exception handler
- fix generate forever
- enable cmdflags compatibility
- change default css font
- fix ti previews on initial start
- enhance tracebacks
- pin transformers version to last known good version
- fix extension loader
## Update for 2023-04-12
This has been pending for a while, but finally uploaded some massive changes
- New launcher
- `webui.bat` and `webui.sh`:
Platform specific wrapper scripts that starts `launch.py` in Python virtual environment
*Note*: Server can run without virtual environment, but it is recommended to use it
This is carry-over from original repo
**If you're unsure which launcher to use, this is the one you want**
- `launch.py`:
Main startup script
Can be used directly to start server in manually activated `venv` or to run it without `venv`
- `installer.py`:
Main installer, used by `launch.py`
- `webui.py`:
Main server script
- New logger
- New exception handler
- Built-in performance profiler
- New requirements handling
- Move of most of command line flags into UI Settings
|